FRIEDRICH-ALEXANDER UNIVERSITÄT ERLANGEN-NÜRNBERG


Lectures on Quantum Theory 


Course delivered in 2015 by 
Dr Frederic P. Schuller 


Friedrich-Alexander-Universität Erlangen-Nürnberg,
Institut für Theoretische Physik III


E-mail: fps@aei.mpg.de 


Notes taken by Simon Rea & Richie Dadhley 


E-mail: s.rea.hw@gmail.com 
E-mail: richie@dadhley.com 


Last updated on January 31, 2019 


ACKNOWLEDGMENTS 


This set of lecture notes accompanies Frederic Schuller’s course on Quantum Theory, taught 
in the summer of 2015 at the Friedrich-Alexander-Universität Erlangen-Nürnberg as part
of the Elite Graduate Programme. 

The entire course is hosted on YouTube at the following address: 


www.youtube.com/playlist?list=PLPH7f_7Z1zxQVx5jRjbfRGEZWY_upS5K6


These lecture notes are not endorsed by Dr. Schuller or the University. 

While I have tried to correct typos and errors made during the lectures (some helpfully 
pointed out by YouTube commenters), I have also taken the liberty to add and/or modify 
some of the material at various points in the notes. Any errors that may result from this 
are, of course, mine. 

If you have any comments regarding these notes, feel free to get in touch. Visit my
blog for the most up-to-date version of these notes:


http://mathswithphysics.blogspot.com


My gratitude goes to Dr. Schuller for lecturing this course and for making it available 
on YouTube. 


Simon Rea 


I have picked up the notes from where Simon left off (Lecture 8). I have also been 
through the first lectures and added small details that I thought helpful. I have tried 
to stay consistent with Simon’s inclusion of additional valuable material throughout the 
remainder of the course. As with Simon’s comment above, any mistakes made because of 
this are, of course, mine. 

I have also made the up-to-date version of the notes available on my blog site:


https://richie291.wixsite.com/theoreticalphysics


I would like to extend a message of thanks to Simon for providing these notes (and the
notes for Dr. Schuller’s other courses) online; I have personally found them very useful.
I would also like to show my gratitude to Dr. Schuller for putting his courses on
YouTube; I have found them both very informative and interesting, a credit to his brilliant
teaching ability.


Richie Dadhley 


CONTENTS 


Introduction

1 Axioms of quantum mechanics
1.1 Desiderata
1.2 The axioms of quantum mechanics

2 Banach spaces
2.1 Generalities on Banach spaces
2.2 Bounded linear operators
2.3 Extension of bounded linear operators

3 Separable Hilbert Spaces
3.1 Relationship between norms and inner products
3.2 Hamel versus Schauder
3.3 Separable Hilbert spaces
3.4 Unitary maps

4 Projectors, bras and kets
4.1 Projectors
4.2 Closed linear subspaces
4.3 Orthogonal projections
4.4 Riesz representation theorem, bras and kets

5 Measure theory
5.1 General measure spaces and basic results
5.2 Borel σ-algebras
5.3 Lebesgue measure on $\mathbb{R}^d$
5.4 Measurable maps
5.5 Push-forward of a measure

6 Integration of measurable functions
6.1 Characteristic and simple functions
6.2 Integration of non-negative measurable simple functions
6.3 Integration of non-negative measurable functions
6.4 Lebesgue integrable functions
6.5 The function spaces $L^p(M, \Sigma, \mu)$

7 Self-adjoint and essentially self-adjoint operators
7.1 Adjoint operators
7.2 The adjoint of a symmetric operator
7.3 Closability, closure, closedness
7.4 Essentially self-adjoint operators
7.5 Criteria for self-adjointness and essential self-adjointness

8 Spectra and perturbation theory
8.1 Resolvent map and spectrum
8.2 The spectrum of a self-adjoint operator
8.3 Perturbation theory for point spectra of self-adjoint operators

9 Case study: momentum operator
9.1 The Momentum Operator
9.2 Absolutely Continuous Functions and Sobolev Spaces
9.3 Momentum Operator on a Compact Interval
9.4 Momentum Operator on a Circle

10 Inverse spectral theorem
10.1 Projection-valued measures
10.2 Real and Complex Valued Borel Measures Induced by a PVM
10.3 Integration With Respect to a PVM
10.4 The Inverse Spectral Theorem

11 Spectral theorem
11.1 Measurable Function Applied To A Spectrally Decomposable Self Adjoint Operator
11.2 Reconstruct PVM From a Spectrally Decomposable, Self Adjoint Operator
11.3 Construction Of PVM From A Self Adjoint Operator
11.4 Commuting Operators

12 Stone’s Theorem and Construction of Observables
12.1 One Parameter Groups and Their Generators
12.2 Stone’s Theorem
12.3 Domains of Essential Self Adjointness ("Cores")
12.4 Position, Momentum and Angular Momentum
12.5 Schwartz Space $\mathcal{S}(\mathbb{R}^d)$

13 Spin
13.1 General Spin
13.2 Derivation of Pure Point Spectrum
13.3 Pure Spin-$j$ Systems

14 Composite Systems
14.1 Tensor Product of Hilbert Spaces
14.2 Practical Rules for Tensor Products of Vectors
14.3 Tensor Product Between Operators
14.4 Symmetric and Antisymmetric Tensor Products
14.5 Collapse of Notation
14.6 Entanglement

15 Total Spin of Composite System
15.1 Total Spin
15.2 Eigenbasis For The Composite System in Terms of Simultaneous Eigenvectors of $A^2 \otimes 1$, $1 \otimes B^2$, $A_3 \otimes 1$ and $1 \otimes B_3$
15.3 Conversion to Eigenbasis in Terms of $J^2$, $J_3$, $A^2 \otimes 1$, $1 \otimes B^2$
15.4 Value of $m$
15.5 Clebsch-Gordan Coefficients for Maximal $j$
15.6 Clebsch-Gordan Coefficients For Lower Than Max $j$
15.7 Total Spin of Composite System

16 Quantum Harmonic Oscillator
16.1, 16.2 [titles illegible in the source]
16.3 The Energy Spectrum

17 Measurements of Observables
17.1 Spectral Decomposition of $H$ For The Quantum Harmonic Oscillator
17.2 Probability to Measure a Certain Value
17.3 Measurements of Pure States
17.4 Preparation of Pure States
17.5 Mixed States

18 The Fourier Operator
18.1 The Fourier Operator on Schwartz Space
18.2, 18.3 [titles illegible in the source]
18.4 Convolutions

19 The Schrödinger Operator
19.1 Domain of Self Adjointness of $H_{\mathrm{free}}$
19.2 Spectrum of $H_{\mathrm{free}}$
19.3 Time Evolution
19.4 Physical Interpretation (for the Patient Observer)

20 Periodic Potentials
20.1 Basics of Rigged Hilbert Space
20.2 Fundamental Solutions and Fundamental Matrix
20.3 Translation Operator
20.4 Application to Our Quantum Problem
20.5 Energy Bands
20.6 Quantitative Calculation (Outline)

21 Relativistic Quantum Mechanics
21.1 Heuristic Derivation of the Schrödinger Equation
21.2 Heuristic Derivation of the Relativistic Schrödinger Equation (Klein-Gordon Equation)
21.3 Dirac Equation
21.4 The Deep Root of the Problem
21.5 Feynman Diagrams

Further readings


INTRODUCTION 


Quantum mechanics has a reputation for being a difficult subject, and it really deserves 
that reputation. It is, indeed, very difficult. This is partly due to the fact that, unlike 
classical mechanics or electromagnetism, it is very different from what we feel the world 
is. But the fault is on us. The world does not behave in the way that we feel it should 
from our everyday experience. Of course, the reason why classical mechanics works so well 
for modelling stones, rockets and planets is that the masses involved are much larger than 
those of, say, elementary particles, while the speeds are much slower than the speed of light. 
However, even the stone that one throws doesn’t follow a trajectory governed by Newton’s 
axioms. In fact, it doesn’t follow a trajectory at all. The very idea of a point particle 
following a trajectory turns out to be entirely wrong. So don’t worry if your classical 
mechanics course didn’t go well. It’s all wrong anyway! 

We know from the double slit experiment that reality is more complicated. The
result of the experiment can be interpreted as the electron going through both slits and
neither slit at the same time, and in fact taking every possible path. The experiment has
been replicated with objects much larger than an electron¹, and in principle it would work
even if we used a whale (which is not a fish!).


¹Eibenberger et al., Matter-wave interference with particles selected from a molecular library with masses
exceeding 10000 amu, https://arxiv.org/abs/1310.8343


1 Axioms of quantum mechanics 


People discovered what was wrong with classical mechanics bit by bit and, consequently, 
the historical development of quantum mechanics was highly “non-linear”. Rather than 
following this development, we will afford the luxury of having a well-working theory of 
quantum mechanics, and we will present it from the foundations up. We begin by writing 
down a list of things we would like to have.


1.1 Desiderata²


A working theory of quantum mechanics would need to account for the following. 


(a) Measurements of observables, unlike in classical mechanics, don’t just range over an
interval $I \subseteq \mathbb{R}$.


Recall that in classical mechanics an observable is a map $F\colon \Gamma \to \mathbb{R}$, where $\Gamma$ is
the phase space of the system, typically given by the cotangent bundle $T^*Q$ of some
configuration manifold $Q$. The map is taken to be at least continuous with respect
to the standard topology on $\mathbb{R}$ and an appropriate topology on $\Gamma$, and hence if $\Gamma$ is
connected, we have $F(\Gamma) = I \subseteq \mathbb{R}$.

Consider, for instance, the two-body problem. We have a potential $V(r) \propto -\frac{1}{r}$
and, assuming that the angular momentum $L$ is non-zero, the energy observable (or
Hamiltonian) $H$ satisfies $H(\Gamma) = [E_{\min}, \infty) \subset \mathbb{R}$.

However, measurements of the spectrum of the hydrogen atom give the following
values for the energies (in electronvolts) assumed by the electron

$$\left\{ -13.6 \times \tfrac{1}{n^2} \;\middle|\; n \in \mathbb{N}^+ \right\} \cup (0, \infty).$$


Hence, we need to turn to new mathematics in which we can define a notion of
observable that allows for a spectrum of measurement results for a quantum observable
$A$ of the form

$$\sigma(A) = \text{discrete part} \cup \text{continuous part}.$$


An example would be the energies of the hydrogen atom

[Figure: $\sigma(H)$ on the real line, with discrete points starting at $-13.6$ eV and
accumulating at $0$ eV, followed by a continuum above $0$ eV.]


Note that one of the parts may actually be empty. For instance, as we will later show,
the simple quantum harmonic oscillator has the following energy spectrum

[Figure: $\sigma(H)$ on the real line, consisting of the equally spaced points
$\frac{1}{2}\hbar\omega, \ldots, (\frac{1}{2} + n)\hbar\omega, \ldots$]

while the spectrum of the position operator $Q$ is $\sigma(Q) = \mathbb{R}$.


²Educated term for “wishlist”.


Also, the continuous part need not be connected, as is the case with the spectrum of the
Hamiltonian of an electron in a periodic potential

[Figure: $\sigma(H)$ on the real line as a union of disjoint intervals.]


It turns out that self-adjoint linear maps on a complex Hilbert space provide a suitable 
formalism to describe the observables of quantum mechanics. 


(b) An irreducible impact that each measurement has on the state of a quantum system. 


The crucial example demonstrating this is the Stern-Gerlach experiment, which con- 
sists in the following. Silver atoms are heated up in an oven and sent against a screen 
with a hole. The atoms passing through the hole are then subjected to an inho- 
mogeneous magnetic field, which deflects them according to the component of their 
angular momentum in the direction of the field. Finally, a screen detects the various 
deflections. 


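[Figure: Stern-Gerlach set-up. Ag atoms leave the oven, pass through an inhomogeneous
magnetic field and hit a final screen. Two screens are drawn: S, showing the classically
expected pattern, and W, showing the observed pattern.]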


Since the angular momentum distribution of the silver atoms coming from the oven is
random, we would expect an even distribution of values of the component along the
direction of the magnetic field to be recorded on the final screen, as in S. However,
the impact pattern actually detected is that on the screen labelled W. In fact, 50% of
the incoming atoms impact at the top and we say that their angular momentum
component is $\uparrow$, and the other 50% hit the bottom region, and we say that their
angular momentum component is $\downarrow$. This is another instance of our earlier point:
there seem to be only two possible values for the component of angular momentum
along the direction of the magnetic field, i.e. the spectrum is discrete. Hence, this is
not particularly surprising at this point.


Let us now consider successive iterations of this experiment. Introduce some system
of cartesian coordinates $(x, y, z)$ and let SG($x$) and SG($z$) denote Stern-Gerlach
apparatuses whose magnetic field points in the $x$- and $z$-direction, respectively.


Suppose that we send the atoms through a first SG($z$) apparatus, and then we use
the $z^\uparrow$-output as the input of a second SG($z$) apparatus.


The second SG($z$) apparatus finds no $z^\downarrow$-atoms. This is not surprising since,
intuitively, we “filtered out” all the $z^\downarrow$-atoms with the first apparatus. Suppose now that
we feed the $z^\uparrow$-output of a SG($z$) apparatus into a SG($x$) apparatus.


[Figure: the $z^\uparrow$-output of SG($z$) enters SG($x$); each of the two outputs of SG($x$)
carries 50% of the atoms.]


Experimentally, we find that about half of the atoms are detected in the state $x^\uparrow$ and
half in the state $x^\downarrow$. This is, again, not surprising since we only filtered out the $z^\downarrow$
atoms, and hence we can interpret this result as saying that the $x^\uparrow$, $x^\downarrow$ states are
independent from the $z^\uparrow$, $z^\downarrow$.


If our idea of “filtering states out” is correct, then feeding the $x^\uparrow$-output of the
previous set-up to another SG($z$) apparatus should clearly produce a 100% $z^\uparrow$-output,
since we already filtered out all the $z^\downarrow$ ones in the previous step.


Surprisingly, the output is again 50-50. The idea behind this result is the following.
The SG($z$) apparatus left the atoms in a state such that a repeated measurement
with the SG($z$) apparatus would give the same result, and similarly for the SG($x$)
apparatus. However, the measurement of the SG($x$) apparatus somehow altered the
state of the atoms in such a way as to “reset” them with respect to a measurement
by the SG($z$) apparatus. For more details on the Stern-Gerlach experiment and
further conclusions one can draw from its results, you should consult the book Modern
Quantum Mechanics by J. J. Sakurai. The conclusion that we are interested in here
is that measurements can alter the state of a system.


(c) Even if the state $\rho$ of a quantum system is completely known, the only prediction one
can make for the measurement of some observable $A$ is the probability that the measured
value, which is an element of the spectrum $\sigma(A)$, lies within a Borel-measurable
subset $E \subseteq \mathbb{R}$, denoted by $\mu^A_\rho(E)$.


In particular, one cannot predict which concrete outcome an isolated measurement 
will produce. This is even more annoying given that the precise impact that a mea- 
surement has on the state of the system (see previous point) depends on the observed 
outcome of the measurement. 


A suitable theory that accommodates all known experimental facts was developed
between 1900 and 1927 on the physics side by, among others, Schrödinger, Heisenberg and
Dirac, and on the mathematical side almost single-handedly by von Neumann, who invented
a massive proportion of a field known today as functional analysis.


1.2 The axioms of quantum mechanics


We will now present the axioms of quantum mechanics by using notions and terminology 
that will be defined later in the course. In this sense, this section constitutes a preview of 
the next few lectures. 


Axiom 1 (Quantum systems and states). To every quantum system there is associated
a separable complex Hilbert space $(\mathcal{H}, +, \cdot, \langle\cdot|\cdot\rangle)$. The states of the system are
all positive, trace-class linear maps $\rho\colon \mathcal{H} \to \mathcal{H}$ for which $\mathrm{Tr}\,\rho = 1$.


Remark 1.1. Throughout the quantum mechanics literature, it is stated that the unit, or
normalised, elements $\psi \in \mathcal{H}$ (that is, $\langle\psi|\psi\rangle = 1$) are the states of the quantum system.
This is not correct.

States can be pure or mixed. A state $\rho\colon \mathcal{H} \to \mathcal{H}$ is called pure if

$$\exists\, \psi \in \mathcal{H} : \forall\, \alpha \in \mathcal{H} : \rho(\alpha) = \frac{\langle\psi|\alpha\rangle}{\langle\psi|\psi\rangle}\, \psi.$$

Thus, we can associate to each pure state $\rho$ an element $\psi \in \mathcal{H}$. However, this correspondence
is not one-to-one. Even if we restrict to pure states and impose the normalisation condition,
there can be many $\psi \in \mathcal{H}$ representing the same pure state $\rho$.

Therefore, it is wrong to say that the states of the quantum system are the normalised
elements of the Hilbert space, since they do not represent all the states of the system, and
do not even represent uniquely the states that they do represent.
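
The non-uniqueness claimed in Remark 1.1 can be checked numerically in the simplest case $\mathcal{H} = \mathbb{C}^2$. The following is only a sketch under that finite-dimensional assumption; the helper names `inner` and `pure_state` and the particular vectors are illustrative choices, not part of the lectures. Two different normalised vectors that differ by a phase $e^{i\theta}$ are shown to define the same map $\rho$.

```python
import cmath
import math


def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def pure_state(psi):
    """Return the map alpha -> (<psi|alpha> / <psi|psi>) psi."""
    norm_sq = inner(psi, psi)

    def rho(alpha):
        coeff = inner(psi, alpha) / norm_sq
        return [coeff * c for c in psi]

    return rho


# A normalised vector in C^2 (illustrative choice).
psi = [1 / math.sqrt(2) + 0j, 1j / math.sqrt(2)]

# Multiply by an arbitrary phase: a different vector, still normalised.
phase = cmath.exp(0.7j)
psi_rotated = [phase * c for c in psi]

# Apply both pure states to an arbitrary test vector.
alpha = [0.3 - 0.2j, 1.5 + 0.4j]
image_1 = pure_state(psi)(alpha)
image_2 = pure_state(psi_rotated)(alpha)

# The two images agree up to floating-point rounding.
max_difference = max(abs(u - v) for u, v in zip(image_1, image_2))
print(max_difference)
```

The phase cancels because $\psi$ enters the formula once conjugated (in $\langle\psi|\alpha\rangle$) and once unconjugated.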


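The terms used in Axiom 1 are defined as follows.

Definition. A complex Hilbert space is a tuple $(\mathcal{H}, +, \cdot, \langle\cdot|\cdot\rangle)$ where

• $\mathcal{H}$ is a set
• $+$ is a map $+\colon \mathcal{H} \times \mathcal{H} \to \mathcal{H}$
• $\cdot$ is a map $\cdot\colon \mathbb{C} \times \mathcal{H} \to \mathcal{H}$ (typically suppressed in the notation)

such that the triple $(\mathcal{H}, +, \cdot)$ is a vector space over $\mathbb{C}$, and

• $\langle\cdot|\cdot\rangle$ is a sesqui-linear³ inner product, i.e. a map $\langle\cdot|\cdot\rangle\colon \mathcal{H} \times \mathcal{H} \to \mathbb{C}$ satisfying

(i) $\langle\varphi|\psi\rangle = \overline{\langle\psi|\varphi\rangle}$ (conjugate symmetry/Hermitian property)
(ii) $\langle\varphi|z\psi_1 + \psi_2\rangle = z\langle\varphi|\psi_1\rangle + \langle\varphi|\psi_2\rangle$ (linearity in the second argument)
(iii) $\langle\psi|\psi\rangle \geq 0$ and $\langle\psi|\psi\rangle = 0 \Leftrightarrow \psi = 0_{\mathcal{H}}$ (positive-definiteness)

for all $\varphi, \psi, \psi_1, \psi_2 \in \mathcal{H}$ and $z \in \mathbb{C}$,

and moreover

• $\mathcal{H}$ is a complete metric space with respect to the metric induced by the norm induced
in turn by the sesqui-linear map $\langle\cdot|\cdot\rangle$. Explicitly, every sequence $\phi\colon \mathbb{N} \to \mathcal{H}$ that
satisfies the Cauchy property, namely

$$\forall\, \varepsilon > 0 : \exists\, N \in \mathbb{N} : \forall\, n, m \geq N : \|\phi_n - \phi_m\| < \varepsilon,$$

where $\phi_n := \phi(n)$ and $\|\psi\| := \sqrt{\langle\psi|\psi\rangle}$, converges in $\mathcal{H}$, i.e.

$$\exists\, \varphi \in \mathcal{H} : \forall\, \varepsilon > 0 : \exists\, N \in \mathbb{N} : \forall\, n \geq N : \|\varphi - \phi_n\| < \varepsilon.$$

Note that the $\mathbb{C}$-vector space $(\mathcal{H}, +, \cdot)$ need not be finite-dimensional and, in fact, we
will mostly work with infinite-dimensional Hilbert spaces.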


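Definition. A map $A\colon \mathcal{D}_A \to \mathcal{H}$, where the subspace $\mathcal{D}_A \subseteq \mathcal{H}$ is called the domain of $A$,
is a linear map if

$$\forall\, \varphi, \psi \in \mathcal{D}_A : \forall\, z \in \mathbb{C} : A(z\varphi + \psi) = zA(\varphi) + A(\psi).$$


From now on, if there is no risk of confusion, we will write $A\varphi := A(\varphi)$ in order to
spare some brackets. We will be particularly interested in special types of linear map.


Definition. A linear map $A\colon \mathcal{D}_A \to \mathcal{H}$ is densely defined if $\mathcal{D}_A$ is dense in $\mathcal{H}$, i.e.

$$\forall\, \psi \in \mathcal{H} : \forall\, \varepsilon > 0 : \exists\, \alpha \in \mathcal{D}_A : \|\alpha - \psi\| < \varepsilon.$$

Definition. A linear map $A\colon \mathcal{D}_A \to \mathcal{H}$ is said to be positive if

$$\forall\, \psi \in \mathcal{D}_A : \langle\psi|A\psi\rangle \geq 0.$$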


³sesqui is Latin for “one and a half”.


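Definition. A linear map $A\colon \mathcal{D}_A \to \mathcal{H}$ is said to be of trace-class if $\mathcal{D}_A = \mathcal{H}$ and, for any
orthonormal basis $\{e_n\}$ of $\mathcal{H}$, the sum/series

$$\sum_n \langle e_n|Ae_n\rangle < \infty.$$


If $A\colon \mathcal{H} \to \mathcal{H}$ is of trace-class, one can show that the value of $\sum_n \langle e_n|Ae_n\rangle$ does not
depend on the choice of orthonormal basis $\{e_n\}$.


Definition. Let $A\colon \mathcal{H} \to \mathcal{H}$ be of trace-class. Then the trace of $A$ is

$$\mathrm{Tr}\,A := \sum_n \langle e_n|Ae_n\rangle,$$

where $\{e_n\}$ is any orthonormal basis of $\mathcal{H}$.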
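
As a sanity check of the basis independence of the trace, here is a small sketch for $\mathcal{H} = \mathbb{C}^2$, where every linear map is trivially of trace-class. The matrix `A`, the angle `theta` and the helper names are assumptions made purely for illustration.

```python
import math


def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(row[k] * vector[k] for k in range(len(vector))) for row in matrix]


def trace_in_basis(matrix, basis):
    """Sum of <e_n|A e_n> over the given orthonormal basis."""
    return sum(inner(e, apply(matrix, e)) for e in basis)


# An arbitrary operator on C^2 (illustrative choice).
A = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]

standard_basis = [[1 + 0j, 0j], [0j, 1 + 0j]]

# A second orthonormal basis: e1 = (cos t, i sin t), e2 = (i sin t, cos t).
theta = 0.4
rotated_basis = [
    [math.cos(theta) + 0j, math.sin(theta) * 1j],
    [math.sin(theta) * 1j, math.cos(theta) + 0j],
]

trace_standard = trace_in_basis(A, standard_basis)
trace_rotated = trace_in_basis(A, rotated_basis)
print(trace_standard, trace_rotated)
```

Both sums equal the sum of the diagonal entries of `A`, namely 5, up to rounding.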


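Axiom 2 (Observables). The observables of a quantum system are the self-adjoint
linear maps $A\colon \mathcal{D}_A \to \mathcal{H}$.


While the notion of a self-adjoint map is easy to define in finite-dimensional spaces, it
is much more subtle for infinite-dimensional spaces.


Definition. A densely defined linear map $A\colon \mathcal{D}_A \to \mathcal{H}$ is said to be self-adjoint if it
coincides with its adjoint map $A^*\colon \mathcal{D}_{A^*} \to \mathcal{H}$, that is

• $\mathcal{D}_A = \mathcal{D}_{A^*}$
• $\forall\, \varphi \in \mathcal{D}_A : A\varphi = A^*\varphi$.

Definition. The adjoint map $A^*\colon \mathcal{D}_{A^*} \to \mathcal{H}$ of a densely defined linear map $A\colon \mathcal{D}_A \to \mathcal{H}$ is defined by

• $\mathcal{D}_{A^*} := \{\psi \in \mathcal{H} \mid \exists\, \eta \in \mathcal{H} : \forall\, \alpha \in \mathcal{D}_A : \langle\psi|A\alpha\rangle = \langle\eta|\alpha\rangle\}$
• $A^*\psi := \eta$.


We will later show that the adjoint map is well-defined, i.e. for each $\psi \in \mathcal{H}$
there exists at most one $\eta \in \mathcal{H}$ such that $\langle\psi|A\alpha\rangle = \langle\eta|\alpha\rangle$ for all $\alpha \in \mathcal{D}_A$.


Remark 1.2. If we defined $\mathcal{D}_{A^*}$ by requiring that $\eta \in \mathcal{D}_A$, we would obtain a notion of
self-adjointness which has undesirable properties. In particular, the spectrum (to be defined
later) of a self-adjoint operator would not be guaranteed to be a subset of $\mathbb{R}$.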
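
In finite dimensions the domain subtleties disappear and the adjoint is simply the conjugate transpose. The sketch below, assuming $\mathcal{H} = \mathbb{C}^2$ and using illustrative matrices and helper names, verifies the defining relation $\langle\psi|A\alpha\rangle = \langle A^*\psi|\alpha\rangle$ and checks self-adjointness of a Hermitian matrix. It says nothing about the infinite-dimensional case, where the domains matter.

```python
def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(row[k] * vector[k] for k in range(len(vector))) for row in matrix]


def conjugate_transpose(matrix):
    """Finite-dimensional adjoint: transpose and complex-conjugate."""
    n = len(matrix)
    return [[matrix[j][i].conjugate() for j in range(n)] for i in range(n)]


# A generic (non-self-adjoint) operator and its adjoint.
A = [[1 + 2j, 0.5j],
     [3 + 0j, -1j]]
A_star = conjugate_transpose(A)

# Arbitrary test vectors.
psi = [0.2 + 1j, -0.7 + 0.1j]
alpha = [1.1 - 0.3j, 0.4 + 0.9j]

lhs = inner(psi, apply(A, alpha))        # <psi | A alpha>
rhs = inner(apply(A_star, psi), alpha)   # <A* psi | alpha>

# A Hermitian matrix coincides with its adjoint.
hermitian = [[2 + 0j, 1 - 1j],
             [1 + 1j, 3 + 0j]]
is_self_adjoint = conjugate_transpose(hermitian) == hermitian
print(abs(lhs - rhs), is_self_adjoint)
```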


Axiom 3 (Measurement). The probability that a measurement of an observable $A$
on a system that is in the state $\rho$ yields a result in the Borel set $E \subseteq \mathbb{R}$ is given by

$$\mu^A_\rho(E) := \mathrm{Tr}(P_A(E) \circ \rho),$$

where the map $P_A\colon \mathrm{Borel}(\mathbb{R}) \to \mathcal{L}(\mathcal{H})$, from the Borel-measurable subsets of $\mathbb{R}$ to the
Banach space of bounded linear maps on $\mathcal{H}$, is the unique projection-valued measure
that is associated with the self-adjoint map $A$ according to the spectral theorem.


We will later see that the composition of a bounded linear map with a trace-class map
is again of trace-class, so that $\mathrm{Tr}(P_A(E) \circ \rho)$ is well-defined. For completeness, the spectral
theorem states that for any self-adjoint map $A$ there exists a projection-valued measure $P_A$
such that $A$ can be represented in terms of the Lebesgue-Stieltjes integral as

$$A = \int_{\mathbb{R}} \lambda \, \mathrm{d}P_A(\lambda).$$

This is the infinite-dimensional analogue of the diagonalisation theorem for symmetric or
Hermitian matrices on finite-dimensional vector spaces, and it is the theorem in which the
first half of the course will find its climax.


Axiom 4 (Unitary dynamics). In a time interval $(t_1, t_2) \subseteq \mathbb{R}$ in which no measurement
occurs, the state $\rho$ at time $t_1$, denoted $\rho(t_1)$, is related to the state $\rho$ at time $t_2$,
denoted $\rho(t_2)$, by

$$\rho(t_2) = U(t_2 - t_1)\, \rho(t_1)\, U^{-1}(t_2 - t_1)$$

with the unitary evolution operator $U$ defined as

$$U(t) := \exp\left(-\frac{i}{\hbar} H t\right),$$

where $H$ is the energy observable and, for any observable $A$ and $f\colon \mathbb{R} \to \mathbb{C}$, we define

$$f(A) := \int_{\mathbb{R}} f(\lambda) \, \mathrm{d}P_A(\lambda).$$


Note that, as was the case for the previous axiom, the spectral theorem is crucial since
it is needed to define the unitary evolution operator.
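
For a finite-dimensional, already diagonal Hamiltonian, the integral defining $f(A)$ reduces to applying $f$ to each eigenvalue. The following sketch makes that reduction explicit for a two-level system. The energies, the units with $\hbar = 1$ and all helper names are assumptions chosen for illustration; this is not a general implementation of the functional calculus.

```python
import cmath
import math

HBAR = 1.0  # natural units (assumption)


def function_of_diagonal(f, eigenvalues):
    """f(A) for A = diag(eigenvalues): apply f to each eigenvalue."""
    n = len(eigenvalues)
    return [[f(eigenvalues[i]) if i == j else 0j for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Conjugate transpose; equals the inverse for a unitary matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def evolve(rho, energies, t):
    """rho(t) = U(t) rho U(t)^{-1} with U(t) = exp(-i H t / hbar)."""
    U = function_of_diagonal(lambda E: cmath.exp(-1j * E * t / HBAR), energies)
    return matmul(matmul(U, rho), dagger(U))


energies = [0.0, 1.0]                       # H = diag(0, 1), illustrative
rho_0 = [[0.5 + 0j, 0.5 + 0j],
         [0.5 + 0j, 0.5 + 0j]]              # pure state along (1, 1)/sqrt(2)

rho_t = evolve(rho_0, energies, math.pi)    # evolve for t = pi
trace_t = rho_t[0][0] + rho_t[1][1]
print(rho_t, trace_t)
```

The diagonal entries and the trace are unchanged, while the off-diagonal entries pick up the relative phase $e^{\pm i(E_1 - E_0)t/\hbar}$, which equals $-1$ at $t = \pi$ for the values chosen here.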


Axiom 5 (Projective dynamics). The state $\rho_{\mathrm{after}}$ of a quantum system immediately
following the measurement of an observable $A$ is

$$\rho_{\mathrm{after}} := \frac{P_A(E) \circ \rho_{\mathrm{before}} \circ P_A(E)}{\mathrm{Tr}(P_A(E) \circ \rho_{\mathrm{before}} \circ P_A(E))},$$

where $\rho_{\mathrm{before}}$ is the state immediately preceding the measurement and $E \subseteq \mathbb{R}$ is the
smallest Borel set in which the actual outcome of the measurement happened to lie.
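
Axioms 3 and 5 together reproduce the Stern-Gerlach statistics described in Section 1.1. The sketch below is an illustration only, under the assumptions that the spin degree of freedom lives on $\mathbb{C}^2$, that the beam leaving the oven is modelled by the maximally mixed state, and that the $z$- and $x$-outcomes correspond to the rank-one projectors written in the code. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def projector(v):
    """Orthogonal projector onto the span of v, i.e. |v><v| / <v|v>."""
    norm_sq = sum(abs(c) ** 2 for c in v)
    return [[v[i] * v[j].conjugate() / norm_sq for j in range(2)]
            for i in range(2)]


def probability(P, rho):
    """Axiom 3: Tr(P rho)."""
    return trace(matmul(P, rho)).real


def collapse(P, rho):
    """Axiom 5: P rho P / Tr(P rho P)."""
    numerator = matmul(matmul(P, rho), P)
    t = trace(numerator)
    return [[entry / t for entry in row] for row in numerator]


z_up = projector([1 + 0j, 0j])
z_down = projector([0j, 1 + 0j])
x_up = projector([1 + 0j, 1 + 0j])

rho = [[0.5 + 0j, 0j], [0j, 0.5 + 0j]]   # unpolarised beam (modelling assumption)

p_first_z_up = probability(z_up, rho)         # first SG(z): half go up
rho = collapse(z_up, rho)                     # keep the z-up output
p_second_z_down = probability(z_down, rho)    # second SG(z): no z-down atoms
p_x_up = probability(x_up, rho)               # SG(x) on z-up beam: 50-50
rho = collapse(x_up, rho)                     # keep the x-up output
p_final_z_down = probability(z_down, rho)     # SG(z) again: 50-50 once more
print(p_first_z_up, p_second_z_down, p_x_up, p_final_z_down)
```

The last probability is the "surprising" one: after the intermediate $x$-measurement, the $z^\downarrow$ outcome reappears with probability one half.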


2 Banach spaces 


Hilbert spaces are a special type of a more general class of spaces known as Banach spaces. 
We are interested in Banach spaces not just for the sake of generality, but also because they
naturally appear in Hilbert space theory. For instance, the space of bounded linear maps 
on a Hilbert space is not itself a Hilbert space, but only a Banach space. 


2.1 Generalities on Banach spaces 
We begin with some basic notions from metric space theory.


Definition. A metric space is a pair $(X, d)$, where $X$ is a set and $d$ is a metric on $X$, that
is, a map $d\colon X \times X \to \mathbb{R}$ satisfying

(i) $d(x, y) \geq 0$ (non-negativity)
(ii) $d(x, y) = 0 \Leftrightarrow x = y$ (identity of indiscernibles)
(iii) $d(x, y) = d(y, x)$ (symmetry)
(iv) $d(x, z) \leq d(x, y) + d(y, z)$ (triangle inequality)

for all $x, y, z \in X$.


Definition. A sequence $\{x_n\}_{n \in \mathbb{N}}$ in a metric space $(X, d)$ is said to converge to an element
$x \in X$ if

$$\forall\, \varepsilon > 0 : \exists\, N \in \mathbb{N} : \forall\, n \geq N : d(x_n, x) < \varepsilon.$$

A sequence in a metric space can converge to at most one element.

Definition. A Cauchy sequence in a metric space $(X, d)$ is a sequence $\{x_n\}_{n \in \mathbb{N}}$ such that

$$\forall\, \varepsilon > 0 : \exists\, N \in \mathbb{N} : \forall\, n, m \geq N : d(x_n, x_m) < \varepsilon.$$

Any convergent sequence is clearly a Cauchy sequence.


Definition. A metric space $(X, d)$ is said to be complete if every Cauchy sequence converges
to some $x \in X$.


A natural metric on a vector space is that induced by a norm. 


Definition. A normed space is a (complex) vector space $(V, +, \cdot)$ equipped with a norm,
that is, a map $\|\cdot\|\colon V \to \mathbb{R}$ satisfying

(i) $\|f\| \geq 0$ (non-negativity)
(ii) $\|f\| = 0 \Leftrightarrow f = 0$ (definiteness)
(iii) $\|z \cdot f\| = |z|\,\|f\|$ (homogeneity/scalability)
(iv) $\|f + g\| \leq \|f\| + \|g\|$ (triangle inequality/sub-additivity)

for all $f, g \in V$ and all $z \in \mathbb{C}$.


Once we have a norm $\|\cdot\|$ on $V$, we can define a metric $d$ on $V$ by

$$d(f, g) := \|f - g\|.$$

Then we say that the normed space $(V, \|\cdot\|)$ is complete if the metric space $(V, d)$, where
$d$ is the metric induced by $\|\cdot\|$, is complete. Note that we will usually suppress inessential
information in the notation, for example writing $(V, \|\cdot\|)$ instead of $(V, +, \cdot, \|\cdot\|)$.


Definition. A Banach space is a complete normed vector space.


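Example 2.1. The space $C^0_{\mathbb{C}}[0,1] := \{f\colon [0,1] \to \mathbb{C} \mid f \text{ is continuous}\}$, where the continuity
is with respect to the standard topologies on $[0,1] \subset \mathbb{R}$ and $\mathbb{C}$, is a Banach space. Let us
show this in some detail.

Proof. (a) First, define two operations $+, \cdot$ pointwise, that is, for any $x \in [0,1]$

$$(f + g)(x) := f(x) + g(x), \qquad (z \cdot f)(x) := z f(x).$$

Suppose that $f, g \in C^0_{\mathbb{C}}[0,1]$, that is

$$\forall\, x_0 \in [0,1] : \forall\, \varepsilon > 0 : \exists\, \delta > 0 : \forall\, x \in (x_0 - \delta, x_0 + \delta) : |f(x) - f(x_0)| < \varepsilon$$

and similarly for $g$. Fix $x_0 \in [0,1]$ and $\varepsilon > 0$. Then, there exist $\delta_1, \delta_2 > 0$ such that

$$\forall\, x \in (x_0 - \delta_1, x_0 + \delta_1) : |f(x) - f(x_0)| < \tfrac{\varepsilon}{2}$$
$$\forall\, x \in (x_0 - \delta_2, x_0 + \delta_2) : |g(x) - g(x_0)| < \tfrac{\varepsilon}{2}.$$

Let $\delta := \min\{\delta_1, \delta_2\}$. Then, for all $x \in (x_0 - \delta, x_0 + \delta)$, we have

$$|(f + g)(x) - (f + g)(x_0)| = |f(x) + g(x) - (f(x_0) + g(x_0))|$$
$$= |f(x) - f(x_0) + g(x) - g(x_0)|$$
$$\leq |f(x) - f(x_0)| + |g(x) - g(x_0)|$$
$$< \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon.$$

Since $x_0 \in [0,1]$ was arbitrary, we have $f + g \in C^0_{\mathbb{C}}[0,1]$. Similarly, for any $z \in \mathbb{C}$
and $f \in C^0_{\mathbb{C}}[0,1]$, we also have $z \cdot f \in C^0_{\mathbb{C}}[0,1]$. Together with these closure properties, the
vector space structure of $\mathbb{C}$ implies that the operations

$$+\colon C^0_{\mathbb{C}}[0,1] \times C^0_{\mathbb{C}}[0,1] \to C^0_{\mathbb{C}}[0,1], \qquad \cdot\colon \mathbb{C} \times C^0_{\mathbb{C}}[0,1] \to C^0_{\mathbb{C}}[0,1]$$

make $(C^0_{\mathbb{C}}[0,1], +, \cdot)$ into a complex vector space.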


(b) Since $[0,1]$ is closed and bounded, it is compact and hence every complex-valued
continuous function $f\colon [0,1] \to \mathbb{C}$ is bounded, in the sense that

$$\sup_{x \in [0,1]} |f(x)| < \infty.$$

We can thus define a norm on $C^0_{\mathbb{C}}[0,1]$, called the supremum (or infinity) norm, by

$$\|f\|_\infty := \sup_{x \in [0,1]} |f(x)|.$$

Let us show that this is indeed a norm on $(C^0_{\mathbb{C}}[0,1], +, \cdot)$ by checking that the four
defining properties hold. Let $f, g \in C^0_{\mathbb{C}}[0,1]$ and $z \in \mathbb{C}$. Then

(b.i) $\|f\|_\infty := \sup_{x \in [0,1]} |f(x)| \geq 0$ since $|f(x)| \geq 0$ for all $x \in [0,1]$.

(b.ii) $\|f\|_\infty = 0 \Leftrightarrow \sup_{x \in [0,1]} |f(x)| = 0$. By definition of supremum, we have

$$\forall\, x \in [0,1] : |f(x)| \leq \sup_{x \in [0,1]} |f(x)| = 0.$$

But since we also have $|f(x)| \geq 0$ for all $x \in [0,1]$, $f$ is identically zero.

(b.iii) $\|z \cdot f\|_\infty = \sup_{x \in [0,1]} |z f(x)| = \sup_{x \in [0,1]} |z|\,|f(x)| = |z| \sup_{x \in [0,1]} |f(x)| = |z|\,\|f\|_\infty$.

(b.iv) By using the triangle inequality for the modulus of complex numbers, we have

$$\|f + g\|_\infty := \sup_{x \in [0,1]} |(f + g)(x)|$$
$$= \sup_{x \in [0,1]} |f(x) + g(x)|$$
$$\leq \sup_{x \in [0,1]} (|f(x)| + |g(x)|)$$
$$\leq \sup_{x \in [0,1]} |f(x)| + \sup_{x \in [0,1]} |g(x)|$$
$$= \|f\|_\infty + \|g\|_\infty.$$

Hence, $(C^0_{\mathbb{C}}[0,1], \|\cdot\|_\infty)$ is indeed a normed space.


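(c) We now show that $C^0_{\mathbb{C}}[0,1]$ is complete. Let $\{f_n\}_{n \in \mathbb{N}}$ be a Cauchy sequence of
functions in $C^0_{\mathbb{C}}[0,1]$, that is

$$\forall\, \varepsilon > 0 : \exists\, N \in \mathbb{N} : \forall\, n, m \geq N : \|f_n - f_m\|_\infty < \varepsilon.$$

We seek an $f \in C^0_{\mathbb{C}}[0,1]$ such that $\lim_{n \to \infty} f_n = f$. We will proceed in three steps.

(c.i) Fix $y \in [0,1]$ and $\varepsilon > 0$. By definition of supremum, we have

$$|f_n(y) - f_m(y)| \leq \sup_{x \in [0,1]} |f_n(x) - f_m(x)| = \|f_n - f_m\|_\infty.$$

Hence, there exists $N \in \mathbb{N}$ such that

$$\forall\, n, m \geq N : |f_n(y) - f_m(y)| < \varepsilon,$$

that is, the sequence of complex numbers $\{f_n(y)\}_{n \in \mathbb{N}}$ is a Cauchy sequence. Since
$\mathbb{C}$ is a complete metric space⁴, there exists $z_y \in \mathbb{C}$ such that $\lim_{n \to \infty} f_n(y) = z_y$.
Thus, we can define a function

$$f\colon [0,1] \to \mathbb{C}, \qquad x \mapsto z_x,$$

called the pointwise limit of $\{f_n\}_{n \in \mathbb{N}}$, which by definition satisfies

$$\forall\, x \in [0,1] : \lim_{n \to \infty} f_n(x) = f(x).$$

Note that this does not automatically imply that $\lim_{n \to \infty} f_n = f$ (convergence with
respect to the supremum norm), nor that $f \in C^0_{\mathbb{C}}[0,1]$, and hence we need to
check separately that these do, in fact, hold.

⁴The standard metric on $\mathbb{C}$ is induced by the modulus of complex numbers.

(c.ii) First, let us check that $f \in C^0_{\mathbb{C}}[0,1]$, that is, $f$ is continuous. Let $x_0 \in [0,1]$ and
$\varepsilon > 0$. For each $x \in [0,1]$ and each $n \in \mathbb{N}$, we have

$$|f(x) - f(x_0)| = |f(x) - f_n(x) + f_n(x) - f_n(x_0) + f_n(x_0) - f(x_0)|$$
$$\leq |f(x) - f_n(x)| + |f_n(x) - f_n(x_0)| + |f_n(x_0) - f(x_0)|.$$

Since $\{f_n\}_{n \in \mathbb{N}}$ is Cauchy with respect to the supremum norm, there exists $N \in \mathbb{N}$
such that $|f_n(x) - f_m(x)| < \frac{\varepsilon}{3}$ for all $n, m \geq N$ and all $x \in [0,1]$. Letting $m \to \infty$
with $n$ and $x$ fixed, and using that $f$ is the pointwise limit of $\{f_n\}_{n \in \mathbb{N}}$, we obtain

$$\forall\, n \geq N : \forall\, x \in [0,1] : |f(x) - f_n(x)| \leq \tfrac{\varepsilon}{3},$$

where, crucially, $N$ does not depend on $x$. In particular, we also have

$$\forall\, n \geq N : |f_n(x_0) - f(x_0)| \leq \tfrac{\varepsilon}{3}.$$

Fix $n \geq N$. Since $f_n \in C^0_{\mathbb{C}}[0,1]$ by assumption, there exists $\delta > 0$ such that

$$\forall\, x \in (x_0 - \delta, x_0 + \delta) : |f_n(x) - f_n(x_0)| < \tfrac{\varepsilon}{3}.$$

Then, it follows that for all $x \in (x_0 - \delta, x_0 + \delta)$, we have

$$|f(x) - f(x_0)| < \tfrac{\varepsilon}{3} + \tfrac{\varepsilon}{3} + \tfrac{\varepsilon}{3} = \varepsilon.$$

Since $x_0 \in [0,1]$ was arbitrary, we have $f \in C^0_{\mathbb{C}}[0,1]$.

(c.iii) Finally, it remains to show that $\lim_{n \to \infty} f_n = f$. To that end, let $\varepsilon > 0$. By the
triangle inequality for $\|\cdot\|_\infty$, we have

$$\|f_n - f\|_\infty = \|f_n - f_m + f_m - f\|_\infty \leq \|f_n - f_m\|_\infty + \|f_m - f\|_\infty.$$

Since $\{f_n\}_{n \in \mathbb{N}}$ is Cauchy by assumption, there exists $N_1 \in \mathbb{N}$ such that

$$\forall\, n, m \geq N_1 : \|f_n - f_m\|_\infty < \tfrac{\varepsilon}{2}.$$

Moreover, by the same limiting argument as in (c.ii), there exists $N_2 \in \mathbb{N}$, independent
of $x$, such that

$$\forall\, m \geq N_2 : \forall\, x \in [0,1] : |f_m(x) - f(x)| \leq \tfrac{\varepsilon}{4}.$$

By definition of supremum, we have

$$\forall\, m \geq N_2 : \|f_m - f\|_\infty = \sup_{x \in [0,1]} |f_m(x) - f(x)| \leq \tfrac{\varepsilon}{4} < \tfrac{\varepsilon}{2}.$$

Let $N := \max\{N_1, N_2\}$ and fix $m \geq N$. Then, for all $n \geq N$, we have

$$\|f_n - f\|_\infty \leq \|f_n - f_m\|_\infty + \|f_m - f\|_\infty < \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon.$$

Thus, $\lim_{n \to \infty} f_n = f$ and we call $f$ the uniform limit of $\{f_n\}_{n \in \mathbb{N}}$.

This completes the proof that $(C^0_{\mathbb{C}}[0,1], \|\cdot\|_\infty)$ is a Banach space. □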


Remark 2.2. The previous example shows that checking that something is a Banach space, 
and the completeness property in particular, can be quite tedious. However, in the following, 
we will typically already be working with a Banach (or Hilbert) space and hence, rather 
than having to check that the completeness property holds, we will instead be able to use 
it to infer the existence (within that space) of the limit of any Cauchy sequence. 
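
The distinction between pointwise convergence and convergence in the supremum norm, which the proof in Example 2.1 has to handle carefully, can be seen numerically. The sketch below uses the sequence $f_n(x) = x^n$ on $[0,1]$, an example chosen here for illustration and not taken from the lectures. The supremum is approximated by a maximum over a finite grid, so the computed values are lower bounds on the true supremum norms.

```python
def sup_distance(f, g, grid_size=10_001):
    """Grid approximation of sup over [0, 1] of |f(x) - g(x)|."""
    step = 1 / (grid_size - 1)
    return max(abs(f(k * step) - g(k * step)) for k in range(grid_size))


def power(n):
    """The continuous function x -> x**n on [0, 1]."""
    return lambda x: x ** n


# Pointwise, the sequence converges at x = 0.5: the values shrink to zero.
pointwise_at_half = [power(n)(0.5) for n in (10, 100, 1000)]

# In the supremum norm, f_n and f_{2n} stay about 1/4 apart for every n,
# so the sequence is not Cauchy and cannot converge in this Banach space.
distances = [sup_distance(power(n), power(2 * n)) for n in (5, 50, 500)]
print(pointwise_at_half, distances)
```

The value $\frac{1}{4}$ is the maximum of $u - u^2$ for $u = x^n \in [0,1]$, attained at $u = \frac{1}{2}$. This is consistent with completeness: the pointwise limit of $x^n$ is discontinuous at $x = 1$, so the sequence must fail to be Cauchy in $\|\cdot\|_\infty$.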


2.2 Bounded linear operators 
As usual in mathematics, once we introduce a new type of structure, we also want to study
maps between instances of those structures, with extra emphasis placed on the structure-
preserving maps. We begin with linear maps from a normed space to a Banach space.


Definition. Let $(V, \|\cdot\|_V)$ be a normed space and $(W, \|\cdot\|_W)$ a Banach space. A linear
map, also called a linear operator, $A\colon V \to W$ is said to be bounded if

$$\sup_{f \in V} \frac{\|Af\|_W}{\|f\|_V} < \infty.$$

Note that the quotient is not defined for $f = 0$. Hence, to be precise, we should write
$V \setminus \{0\}$ instead of just $V$. Let us agree that this is what we mean in the above definition. There
are several equivalent characterisations of the boundedness property.


Proposition 2.3. A linear operator $A\colon V \to W$ is bounded if, and only if, any of the
following conditions are satisfied.

(i) $\sup_{\|f\|_V = 1} \|Af\|_W < \infty$

(ii) $\exists\, k > 0 : \forall\, f \in V : \|f\|_V \leq 1 \Rightarrow \|Af\|_W \leq k$

(iii) $\exists\, k > 0 : \forall\, f \in V : \|Af\|_W \leq k \|f\|_V$

(iv) the map $A\colon V \to W$ is continuous with respect to the topologies induced by the
respective norms on $V$ and $W$

(v) the map $A$ is continuous at $0 \in V$.


The first one of these follows immediately from the homogeneity of the norm. Indeed,
suppose that $f \neq 0$ and $\|f\|_V \neq 1$. Then

$$\frac{\|Af\|_W}{\|f\|_V} = \|f\|_V^{-1} \|Af\|_W = \|A(\|f\|_V^{-1} f)\|_W = \|A\tilde{f}\|_W,$$

where $\tilde{f} := \|f\|_V^{-1} f$ is such that $\|\tilde{f}\|_V = 1$. Hence, the boundedness property is equivalent
to condition (i) above.


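Definition. Let $A\colon V \to W$ be a bounded operator. The operator norm of $A$ is defined as

$$\|A\| := \sup_{\|f\|_V = 1} \|Af\|_W = \sup_{f \in V} \frac{\|Af\|_W}{\|f\|_V}.$$


Example 2.4. Let $\mathrm{id}_W\colon W \to W$ be the identity operator on a Banach space $W$. Then

$$\sup_{f \in W} \frac{\|\mathrm{id}_W f\|_W}{\|f\|_W} = \sup_{f \in W} \frac{\|f\|_W}{\|f\|_W} = 1 < \infty.$$

Hence, $\mathrm{id}_W$ is a bounded operator and has unit norm.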


Example 2.5. Denote by Cé [0, 1] the complex vector space of once continuously differen- 


tiab ie tj n (0.1). Since diff ‘ability impli bee rae 
Bese aE OOO Morcoicr Give aun ard gente 7 © comipiee qamiber oe contin 
ously differentiable functions are again continuously differentiable, this is, in fact, a vector 


subspace of C'2[0, 1], and hence also a normed space with the supremum norm || - ||. 
Consider the first derivative operator 


D: C40, 1] > CZ0, 1] 
Rs. 


We know from undergraduate real analysis that D is a linear operator. We will now show 
that D is an unbounded? linear operator. That is, 
D 
sup Piilee _ 5. 
fect) Fx 


! 


Note that, since the norm is a function into the real numbers, both ||Df||,, and ||f||.. are 
always finite for any f € C0, 1]. Recall that the supremum of a set of real numbers is its 
least upper bound and, in particular, it need not be an element of the set itself. What we 
have to show is that the set 


ae 
Il Flloo 


contains arbitrarily large elements. One way to do this is to exhibit a positively divergent 


| re ceo.a)} CR 


(or unbounded from above) sequence within the set. 


° Some people take the term unbounded to mean “not necessarily bounded”. We take it to mean “definitely 
not bounded” instead. 


—~14-— 


Consider the sequence $\{f_n\}_{n \geq 1}$ where $f_n(x) := \sin(2\pi n x)$. We know that sine is continuously differentiable, hence $f_n \in C^1_{\mathbb{C}}[0,1]$ for each $n \geq 1$, with

\[ Df_n(x) = D(\sin(2\pi n x)) = 2\pi n \cos(2\pi n x). \]

We have

\[ \|f_n\|_\infty = \sup_{x \in [0,1]} |f_n(x)| = \sup_{x \in [0,1]} |\sin(2\pi n x)| = \sup\,[0,1] = 1 \]

and

\[ \|Df_n\|_\infty = \sup_{x \in [0,1]} |Df_n(x)| = \sup_{x \in [0,1]} |2\pi n \cos(2\pi n x)| = \sup\,[0, 2\pi n] = 2\pi n. \]

Hence, we have

\[ \sup_{f \in C^1_{\mathbb{C}}[0,1]} \frac{\|Df\|_\infty}{\|f\|_\infty} \geq \sup_{n \geq 1} \frac{\|Df_n\|_\infty}{\|f_n\|_\infty} = \sup_{n \geq 1} 2\pi n = \infty, \]

which is what we wanted. As an aside, we note that $C^1_{\mathbb{C}}[0,1]$ is not complete with respect to the supremum norm, but it is complete with respect to the norm

\[ \|f\|_{C^1} := \|f\|_\infty + \|f'\|_\infty. \]

With this norm on the domain (and the supremum norm on the target), the derivative operator is in fact bounded, since $\|Df\|_\infty \leq \|f\|_{C^1}$. In general, the boundedness of a linear operator does depend on the choice of norms on its domain and target, as does the numerical value of the operator norm.
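As a rough numerical illustration of Example 2.5 (a sketch only, not part of the lectures), the quotient $\|Df_n\|_\infty / \|f_n\|_\infty$ can be approximated by sampling on a uniform grid. The helper names `sup_norm` and `derivative_ratio` and the grid size are assumptions made for this illustration.

```python
import math

def sup_norm(func, samples=20001):
    """Approximate the supremum norm on [0, 1] by sampling a uniform grid."""
    return max(abs(func(k / (samples - 1))) for k in range(samples))

def derivative_ratio(n):
    """Approximate ||D f_n||_inf / ||f_n||_inf for f_n(x) = sin(2*pi*n*x)."""
    f_n = lambda x: math.sin(2 * math.pi * n * x)
    # The derivative is entered in closed form rather than computed numerically.
    df_n = lambda x: 2 * math.pi * n * math.cos(2 * math.pi * n * x)
    return sup_norm(df_n) / sup_norm(f_n)

# The quotient grows linearly with n, so no finite bound can exist.
for n in (1, 5, 25):
    print(n, derivative_ratio(n), 2 * math.pi * n)
```

The grid only approximates the supremum, so the printed quotients agree with $2\pi n$ up to sampling and rounding error.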


Remark 2.6. Apart from the “minor” details that in quantum mechanics we deal with Hilbert spaces, use a different norm than the supremum norm, and that the (one-dimensional) momentum operator acts as $P(\psi) := -i\hbar\psi'$, the previous example is a harbinger of the fact that the momentum operator in quantum mechanics is unbounded. This will be the case for the position operator $Q$ as well.


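Lemma 2.7. Let $(V, \|\cdot\|)$ be a normed space. Then, addition, scalar multiplication, and the norm are all sequentially continuous. That is, for any sequences $\{f_n\}_{n\in\mathbb{N}}$ and $\{g_n\}_{n\in\mathbb{N}}$ in $V$ converging to $f \in V$ and $g \in V$ respectively, and any sequence $\{z_n\}_{n\in\mathbb{N}}$ in $\mathbb{C}$ converging to $z \in \mathbb{C}$, we have

(i) $\lim_{n\to\infty} (f_n + g_n) = f + g$

(ii) $\lim_{n\to\infty} z_n f_n = z f$

(iii) $\lim_{n\to\infty} \|f_n\| = \|f\|$.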


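Proof. (i) Let $\varepsilon > 0$. Since $\lim_{n\to\infty} f_n = f$ and $\lim_{n\to\infty} g_n = g$ by assumption, there exist $N_1, N_2 \in \mathbb{N}$ such that

\[ \forall\, n \geq N_1: \ \|f - f_n\| < \frac{\varepsilon}{2}, \qquad \forall\, n \geq N_2: \ \|g - g_n\| < \frac{\varepsilon}{2}. \]

Let $N := \max\{N_1, N_2\}$. Then, for all $n \geq N$, we have

\[ \begin{aligned} \|(f_n + g_n) - (f + g)\| &= \|f_n - f + g_n - g\| \\ &\leq \|f_n - f\| + \|g_n - g\| \\ &< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \end{aligned} \]

Hence $\lim_{n\to\infty} (f_n + g_n) = f + g$.

(ii) Since $\{z_n\}_{n\in\mathbb{N}}$ is a convergent sequence in $\mathbb{C}$, it is bounded. That is,

\[ \exists\, k > 0: \ \forall\, n \in \mathbb{N}: \ |z_n| \leq k. \]

Let $\varepsilon > 0$. Since $\lim_{n\to\infty} f_n = f$ and $\lim_{n\to\infty} z_n = z$, there exist $N_1, N_2 \in \mathbb{N}$ such that

\[ \forall\, n \geq N_1: \ \|f - f_n\| < \frac{\varepsilon}{2k}, \qquad \forall\, n \geq N_2: \ |z - z_n| < \frac{\varepsilon}{2\|f\|} \]

(if $f = 0$, the second condition is not needed). Let $N := \max\{N_1, N_2\}$. Then, for all $n \geq N$, we have

\[ \begin{aligned} \|z_n f_n - z f\| &= \|z_n f_n - z_n f + z_n f - z f\| \\ &= \|z_n (f_n - f) + (z_n - z) f\| \\ &\leq \|z_n (f_n - f)\| + \|(z_n - z) f\| \\ &= |z_n| \|f_n - f\| + |z_n - z| \|f\| \\ &< k \, \frac{\varepsilon}{2k} + \frac{\varepsilon}{2\|f\|} \, \|f\| \\ &= \varepsilon. \end{aligned} \]

Hence $\lim_{n\to\infty} z_n f_n = z f$.

(iii) Let $\varepsilon > 0$. Since $\lim_{n\to\infty} f_n = f$, there exists $N \in \mathbb{N}$ such that

\[ \forall\, n \geq N: \ \|f_n - f\| < \varepsilon. \]

By the triangle inequality, we have

\[ \|f_n\| = \|f_n - f + f\| \leq \|f_n - f\| + \|f\| \]

so that $\|f_n\| - \|f\| \leq \|f_n - f\|$. Similarly, $\|f\| - \|f_n\| \leq \|f - f_n\|$. Since

\[ \|f - f_n\| = \|-(f_n - f)\| = |-1| \, \|f_n - f\| = \|f_n - f\|, \]

we have $-\|f_n - f\| \leq \|f_n\| - \|f\| \leq \|f_n - f\|$ or, by using the modulus,

\[ \big|\, \|f_n\| - \|f\| \,\big| \leq \|f_n - f\|. \]

Hence, for all $n \geq N$, we have $\big|\, \|f_n\| - \|f\| \,\big| < \varepsilon$ and thus $\lim_{n\to\infty} \|f_n\| = \|f\|$.

Note that by taking $\{z_n\}_{n\in\mathbb{N}}$ to be the constant sequence whose terms are all equal to some fixed $z \in \mathbb{C}$, we have $\lim_{n\to\infty} z f_n = z f$ as a special case of (ii).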

This lemma will take care of some of the technicalities involved in proving the following 
crucially important result. 


Theorem 2.8. The set $\mathcal{L}(V, W)$ of bounded linear operators from a normed space $(V, \|\cdot\|_V)$ to a Banach space $(W, \|\cdot\|_W)$, equipped with pointwise addition and scalar multiplication and the operator norm, is a Banach space.


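Proof. (a) Define addition and scalar multiplication on $\mathcal{L}(V, W)$ by

\[ (A + B) f := A f + B f, \qquad (zA) f := z A f. \]

It is clear that both $A + B$ and $zA$ are linear operators. Moreover, we have

\[ \begin{aligned} \sup_{f \in V} \frac{\|(A+B) f\|_W}{\|f\|_V} &= \sup_{f \in V} \frac{\|Af + Bf\|_W}{\|f\|_V} \\ &\leq \sup_{f \in V} \frac{\|Af\|_W + \|Bf\|_W}{\|f\|_V} \\ &\leq \sup_{f \in V} \frac{\|Af\|_W}{\|f\|_V} + \sup_{f \in V} \frac{\|Bf\|_W}{\|f\|_V} \\ &< \infty \end{aligned} \]

since $A$ and $B$ are bounded. Hence, $A + B$ is also bounded and we have

\[ \|A + B\| \leq \|A\| + \|B\|. \]

Similarly, for $zA$ we have

\[ \begin{aligned} \sup_{f \in V} \frac{\|(zA) f\|_W}{\|f\|_V} &= \sup_{f \in V} \frac{\|z A f\|_W}{\|f\|_V} \\ &= \sup_{f \in V} \frac{|z| \, \|Af\|_W}{\|f\|_V} \\ &= |z| \sup_{f \in V} \frac{\|Af\|_W}{\|f\|_V} \\ &< \infty \end{aligned} \]

since $A$ is bounded and $|z|$ is finite. Hence, $zA$ is bounded and we have

\[ \|zA\| = |z| \, \|A\|. \]

Thus, we have two operations

\[ \begin{aligned} +\colon \mathcal{L}(V,W) \times \mathcal{L}(V,W) &\to \mathcal{L}(V,W), & (A, B) &\mapsto A + B, \\ \cdot\,\colon \mathbb{C} \times \mathcal{L}(V,W) &\to \mathcal{L}(V,W), & (z, A) &\mapsto zA, \end{aligned} \]

and it is immediate to check that the vector space structure of $W$ induces a vector space structure on $\mathcal{L}(V, W)$ with these operations.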


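(b) We need to show that $(\mathcal{L}(V, W), \|\cdot\|)$ is a normed space, i.e. that $\|\cdot\|$ satisfies the properties of a norm. We have already shown two of these in part (a), namely

(b.iii) $\|zA\| = |z| \, \|A\|$

(b.iv) $\|A + B\| \leq \|A\| + \|B\|$.

The remaining two are easily checked.

(b.i) $\|A\| := \sup_{f \in V} \dfrac{\|Af\|_W}{\|f\|_V} \geq 0$ since $\|\cdot\|_V$ and $\|\cdot\|_W$ are norms.

(b.ii) Again, by using the fact that $\|\cdot\|_W$ is a norm,

\[ \begin{aligned} \|A\| = 0 \ &\Leftrightarrow \ \sup_{f \in V} \frac{\|Af\|_W}{\|f\|_V} = 0 \\ &\Leftrightarrow \ \forall\, f \in V: \ \|Af\|_W = 0 \\ &\Leftrightarrow \ \forall\, f \in V: \ Af = 0 \\ &\Leftrightarrow \ A = 0. \end{aligned} \]

Hence, $(\mathcal{L}(V, W), \|\cdot\|)$ is a normed space.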
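(c) The heart of the proof is showing that $(\mathcal{L}(V, W), \|\cdot\|)$ is complete. We will proceed in three steps, analogously to the case of $C^0_{\mathbb{C}}[0, 1]$.

(c.i) Let $\{A_n\}_{n\in\mathbb{N}}$ be a Cauchy sequence in $\mathcal{L}(V, W)$. Fix $f \in V$ and let $\varepsilon > 0$. Then, there exists $N \in \mathbb{N}$ such that

\[ \forall\, n, m \geq N: \ \|A_n - A_m\| < \frac{\varepsilon}{\|f\|_V}. \]

Then, for all $n, m \geq N$, we have

\[ \begin{aligned} \|A_n f - A_m f\|_W &= \|(A_n - A_m) f\|_W \\ &= \|f\|_V \, \frac{\|(A_n - A_m) f\|_W}{\|f\|_V} \\ &\leq \|f\|_V \sup_{g \in V} \frac{\|(A_n - A_m) g\|_W}{\|g\|_V} \\ &= \|f\|_V \, \|A_n - A_m\| \\ &< \|f\|_V \, \frac{\varepsilon}{\|f\|_V} = \varepsilon. \end{aligned} \]

(Note that if $f = 0$, we simply have $\|A_n f - A_m f\|_W = 0 < \varepsilon$ and, in the future, we will not mention this case explicitly.) Hence, the sequence $\{A_n f\}_{n\in\mathbb{N}}$ is a Cauchy sequence in $W$. Since $W$ is a Banach space, the limit $\lim_{n\to\infty} A_n f$ exists and is an element of $W$. Thus, we can define the operator

\[ A\colon V \to W, \qquad f \mapsto \lim_{n\to\infty} A_n f, \]

called the pointwise limit of $\{A_n\}_{n\in\mathbb{N}}$.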


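(c.ii) We now need to show that $A \in \mathcal{L}(V, W)$. This is where the previous lemma comes in handy. For linearity, let $f, g \in V$ and $z \in \mathbb{C}$. Then

\[ \begin{aligned} A(zf + g) &:= \lim_{n\to\infty} A_n (zf + g) \\ &= \lim_{n\to\infty} (z A_n f + A_n g) \\ &= z \lim_{n\to\infty} A_n f + \lim_{n\to\infty} A_n g \\ &=: z A f + A g, \end{aligned} \]

where we have used the linearity of each $A_n$ and parts (i) and (ii) of Lemma 2.7. For boundedness, parts (ii) and (iii) of Lemma 2.7 yield

\[ \begin{aligned} \|Af\|_W &= \big\| \lim_{n\to\infty} A_n f \big\|_W \\ &= \lim_{n\to\infty} \|A_n f\|_W \\ &= \lim_{n\to\infty} \|f\|_V \, \frac{\|A_n f\|_W}{\|f\|_V} \\ &\leq \lim_{n\to\infty} \|f\|_V \sup_{g \in V} \frac{\|A_n g\|_W}{\|g\|_V} \\ &= \|f\|_V \lim_{n\to\infty} \|A_n\| \end{aligned} \]

for any $f \in V$. By rearranging, we have

\[ \forall\, f \in V: \ \frac{\|Af\|_W}{\|f\|_V} \leq \lim_{n\to\infty} \|A_n\|. \]

Hence, to show that $A$ is bounded, it suffices to show that the limit on the right hand side exists and is finite. Let $\varepsilon > 0$. Since $\{A_n\}_{n\in\mathbb{N}}$ is a Cauchy sequence, there exists $N \in \mathbb{N}$ such that

\[ \forall\, n, m \geq N: \ \|A_n - A_m\| < \varepsilon. \]

Then, by the inequality established in the proof of part (iii) of Lemma 2.7, we have

\[ \big|\, \|A_n\| - \|A_m\| \,\big| \leq \|A_n - A_m\| < \varepsilon \]

for all $n, m \geq N$. Hence, the sequence of real numbers $\{\|A_n\|\}_{n\in\mathbb{N}}$ is a Cauchy sequence. Since $\mathbb{R}$ is complete, this sequence converges to some real number $r \in \mathbb{R}$. Therefore

\[ \sup_{f \in V} \frac{\|Af\|_W}{\|f\|_V} \leq \lim_{n\to\infty} \|A_n\| = r < \infty \]

and thus $A \in \mathcal{L}(V, W)$.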

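(c.iii) To conclude, we have to show that $\lim_{n\to\infty} A_n = A$ with respect to the operator norm. Let $\varepsilon > 0$. Since $\{A_n\}_{n\in\mathbb{N}}$ is Cauchy, there exists $N \in \mathbb{N}$ such that

\[ \forall\, n, m \geq N: \ \|A_n - A_m\| < \frac{\varepsilon}{2}. \]

Fix $f \in V$ and $n \geq N$. Then, for all $m \geq N$, we have

\[ \|A_n f - A_m f\|_W \leq \|A_n - A_m\| \, \|f\|_V < \frac{\varepsilon}{2} \, \|f\|_V. \]

Since $A$ is the pointwise limit of $\{A_m\}_{m\in\mathbb{N}}$, we have $\lim_{m\to\infty} A_m f = A f$. Taking the limit $m \to \infty$ in the inequality above and using Lemma 2.7, we obtain

\[ \|A_n f - A f\|_W = \lim_{m\to\infty} \|A_n f - A_m f\|_W \leq \frac{\varepsilon}{2} \, \|f\|_V. \]

Note that $N$ was chosen independently of $f$; an index depending on $f$ would only yield pointwise convergence again. Hence, for all $n \geq N$, we have

\[ \|A_n - A\| := \sup_{f \in V} \frac{\|A_n f - A f\|_W}{\|f\|_V} \leq \frac{\varepsilon}{2} < \varepsilon. \]

Thus, $\lim_{n\to\infty} A_n = A$ and we call $A$ the uniform limit of $\{A_n\}_{n\in\mathbb{N}}$.

This concludes the proof that $(\mathcal{L}(V, W), \|\cdot\|)$ is a Banach space. □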


Remark 2.9. Note that if V and W are normed spaces, then L(V, W) is again a normed 
space, while for L(V, W) to be a Banach space it suffices that W be a Banach space. 


Remark 2.10. Note that, directly from the definition of the operator norm, any $A \in \mathcal{L}(V, W)$ satisfies

\[ \forall\, f \in V: \ \|Af\|_W \leq \|A\| \, \|f\|_V. \]

The following is an extremely important special case of L(V, W). 
Definition. Let V be a normed space. Then V* := L(V, C) is called the dual of V. 


Note that, since C is a Banach space, the dual of a normed space is a Banach space. 
The elements of V* are variously called covectors or functionals on V. 


Remark 2.11. You may recall from undergraduate linear algebra that the dual of a vector 
space was defined to be the vector space of all linear maps V — C, rather than just the 
bounded ones. This is because, in finite dimensions, all linear maps are bounded. So the 
two definitions agree as long as we are in finite dimensions. If we used the same definition 
for the infinite-dimensional case, then V* would lack some very desirable properties, such 
as that of being a Banach space. 


The dual space can be used to define a weaker notion of convergence called, rather 
unimaginatively, weak convergence. 


Definition. A sequence $\{f_n\}_{n\in\mathbb{N}}$ in $V$ is said to converge weakly to $f \in V$ if

\[ \forall\, \varphi \in V^*: \ \lim_{n\to\infty} \varphi(f_n) = \varphi(f). \]

Note that $\{\varphi(f_n)\}_{n\in\mathbb{N}}$ is just a sequence of complex numbers. To indicate that the sequence $\{f_n\}_{n\in\mathbb{N}}$ converges weakly to $f \in V$ we write

\[ \underset{n\to\infty}{\text{w-lim}}\, f_n = f. \]

In order to further emphasise the distinction with weak convergence, we may say that $\{f_n\}_{n\in\mathbb{N}}$ converges strongly to $f \in V$ if it converges according to the usual definition, and we will write accordingly

\[ \underset{n\to\infty}{\text{s-lim}}\, f_n = f. \]

N—- Oo 
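Proposition 2.12. Let $\{f_n\}_{n\in\mathbb{N}}$ be a sequence in a normed space $(V, \|\cdot\|_V)$. If $\{f_n\}_{n\in\mathbb{N}}$ converges strongly to $f \in V$, then it also converges weakly to $f \in V$, i.e.

\[ \underset{n\to\infty}{\text{s-lim}}\, f_n = f \quad \Rightarrow \quad \underset{n\to\infty}{\text{w-lim}}\, f_n = f. \]

Proof. Let $\varepsilon > 0$ and let $\varphi \in V^*$ (if $\varphi = 0$, there is nothing to prove). Since $\{f_n\}_{n\in\mathbb{N}}$ converges strongly to $f \in V$, there exists $N \in \mathbb{N}$ such that

\[ \forall\, n \geq N: \ \|f_n - f\|_V < \frac{\varepsilon}{\|\varphi\|}. \]

Then, since $\varphi \in V^*$ is bounded, we have

\[ \begin{aligned} |\varphi(f_n) - \varphi(f)| &= |\varphi(f_n - f)| \\ &\leq \|\varphi\| \, \|f_n - f\|_V \\ &< \|\varphi\| \, \frac{\varepsilon}{\|\varphi\|} \\ &= \varepsilon \end{aligned} \]

for any $n \geq N$. Hence, $\lim_{n\to\infty} \varphi(f_n) = \varphi(f)$. That is, $\text{w-lim}_{n\to\infty} f_n = f$. □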
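The estimate used in this proof can be checked numerically in a finite-dimensional setting. The sketch below works in $\mathbb{C}^3$ with the Euclidean norm and a functional of the form $\varphi_a(f) = \sum_k \overline{a_k} f_k$; the vectors `A`, `F`, `H` and the function names are hypothetical choices made only for this illustration, and $\|a\|$ is used as an upper bound for $\|\varphi_a\|$.

```python
import math

def norm(v):
    """Euclidean norm on C^d."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def phi(a, f):
    """The linear functional phi_a(f) = sum_k conj(a_k) * f_k on C^d."""
    return sum(x.conjugate() * y for x, y in zip(a, f))

# Hypothetical data chosen only for this illustration.
A = [1 + 2j, -0.5j, 3 + 0j]
F = [0.5 + 0j, 1j, -2 + 1j]
H = [1 + 1j, 2 + 0j, -1j]

def gap_and_bound(n):
    """Return (|phi(f_n) - phi(f)|, ||a|| * ||f_n - f||) for f_n = f + h / (n + 1)."""
    f_n = [x + y / (n + 1) for x, y in zip(F, H)]
    diff = [x - y for x, y in zip(f_n, F)]
    return abs(phi(A, f_n) - phi(A, F)), norm(A) * norm(diff)

# The first entry never exceeds the second, and both tend to zero as n grows.
for n in (0, 10, 1000):
    print(n, gap_and_bound(n))
```

Here $f_n \to f$ strongly by construction, and the printed pairs show the values $\varphi_a(f_n)$ approaching $\varphi_a(f)$ at least as fast as the bound predicts.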


2.3 Extension of bounded linear operators 


Note that, so far, we have only considered bounded linear maps $A\colon D_A \to W$ where $D_A$ is the whole of $V$, rather than a subspace thereof. The reason for this is that we will only consider densely defined linear maps in general, and any bounded linear map from a dense subspace of $V$ can be extended to a bounded linear map from the whole of $V$. Moreover, the extension is unique. This is the content of the so-called BLT⁶ theorem.


Lemma 2.13. Let $(V, \|\cdot\|)$ be a normed space and let $D_A$ be a dense subspace of $V$. Then, for any $f \in V$, there exists a sequence $\{\alpha_n\}_{n\in\mathbb{N}}$ in $D_A$ which converges to $f$.


Proof. Let $f \in V$. Since $D_A$ is dense in $V$, every open ball centred at $f$ contains an element of $D_A$. Hence, for each $n \in \mathbb{N}$, we can choose

\[ \alpha_n \in D_A \quad \text{such that} \quad \|\alpha_n - f\| < \frac{1}{n+1}. \]

The sequence $\{\alpha_n\}_{n\in\mathbb{N}}$ is a sequence in $D_A$ which does not depend on any choice of $\varepsilon$. Now let $\varepsilon > 0$ and choose $N \in \mathbb{N}$ such that $\frac{1}{N+1} < \varepsilon$. Then we have

\[ \|\alpha_n - f\| < \frac{1}{n+1} \leq \frac{1}{N+1} < \varepsilon \]

for all $n \geq N$. Hence $\lim_{n\to\infty} \alpha_n = f$. □

⁶ Bounded Linear Transformation, not Bacon, Lettuce, Tomato.


Definition. Let $V$, $W$ be vector spaces and let $A\colon D_A \to W$ be a linear map, where $D_A \subseteq V$. An extension of $A$ is a linear map $\widehat{A}\colon V \to W$ such that

\[ \forall\, \alpha \in D_A: \ \widehat{A}\alpha = A\alpha. \]

Theorem 2.14 (BLT theorem). Let $V$ be a normed space and $W$ a Banach space. Any densely defined bounded linear map $A\colon D_A \to W$ has a unique extension $\widehat{A}\colon V \to W$ such that $\widehat{A}$ is bounded. Moreover, $\|\widehat{A}\| = \|A\|$.


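Proof. (a) Let $f \in V$. Since $D_A$ is dense in $V$, there exists a sequence $\{\alpha_n\}_{n\in\mathbb{N}}$ in $D_A$ which converges to $f$. Moreover, since $A$ is bounded, we have

\[ \forall\, n, m \in \mathbb{N}: \ \|A\alpha_n - A\alpha_m\|_W \leq \|A\| \, \|\alpha_n - \alpha_m\|_V, \]

from which it quickly follows that $\{A\alpha_n\}_{n\in\mathbb{N}}$ is Cauchy in $W$, since the convergent sequence $\{\alpha_n\}_{n\in\mathbb{N}}$ is Cauchy in $V$. As $W$ is a Banach space, this sequence converges to an element of $W$ and thus we can define

\[ \widehat{A}\colon V \to W, \qquad f \mapsto \lim_{n\to\infty} A\alpha_n, \]

where $\{\alpha_n\}_{n\in\mathbb{N}}$ is any sequence in $D_A$ which converges to $f$.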


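(b) First, let us show that $\widehat{A}$ is well-defined. Let $\{\alpha_n\}_{n\in\mathbb{N}}$ and $\{\beta_n\}_{n\in\mathbb{N}}$ be two sequences in $D_A$ which converge to $f \in V$ and let $\varepsilon > 0$. Then, there exist $N_1, N_2 \in \mathbb{N}$ such that

\[ \forall\, n \geq N_1: \ \|\alpha_n - f\|_V < \frac{\varepsilon}{2\|A\|}, \qquad \forall\, n \geq N_2: \ \|\beta_n - f\|_V < \frac{\varepsilon}{2\|A\|}. \]

Let $N := \max\{N_1, N_2\}$. Then, for all $n \geq N$, we have

\[ \begin{aligned} \|A\alpha_n - A\beta_n\|_W &= \|A(\alpha_n - \beta_n)\|_W \\ &\leq \|A\| \, \|\alpha_n - \beta_n\|_V \\ &= \|A\| \, \|\alpha_n - f + f - \beta_n\|_V \\ &\leq \|A\| \big( \|\alpha_n - f\|_V + \|f - \beta_n\|_V \big) \\ &< \|A\| \Big( \frac{\varepsilon}{2\|A\|} + \frac{\varepsilon}{2\|A\|} \Big) \\ &= \varepsilon, \end{aligned} \]

where we have used the fact that $A$ is bounded. Thus, we have shown

\[ \lim_{n\to\infty} (A\alpha_n - A\beta_n) = 0. \]

Then, by using Lemma 2.7 and rearranging, we find

\[ \lim_{n\to\infty} A\alpha_n = \lim_{n\to\infty} A\beta_n, \]

that is, $\widehat{A}$ is indeed well-defined.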
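(c) To see that $\widehat{A}$ is an extension of $A$, let $\alpha \in D_A$. The constant sequence $\{\alpha_n\}_{n\in\mathbb{N}}$ with $\alpha_n = \alpha$ for all $n \in \mathbb{N}$ is a sequence in $D_A$ converging to $\alpha$. Hence

\[ \widehat{A}\alpha := \lim_{n\to\infty} A\alpha_n = \lim_{n\to\infty} A\alpha = A\alpha. \]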


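(d) We now check that $\widehat{A} \in \mathcal{L}(V, W)$. For linearity, let $f, g \in V$ and $z \in \mathbb{C}$. As $D_A$ is dense in $V$, there exist sequences $\{\alpha_n\}_{n\in\mathbb{N}}$ and $\{\beta_n\}_{n\in\mathbb{N}}$ in $D_A$ converging to $f$ and $g$, respectively. Moreover, as $D_A$ is a subspace of $V$, the sequence $\{\gamma_n\}_{n\in\mathbb{N}}$ given by

\[ \gamma_n := z\alpha_n + \beta_n \]

is again a sequence in $D_A$ and, by Lemma 2.7,

\[ \lim_{n\to\infty} \gamma_n = zf + g. \]

Then, we have

\[ \begin{aligned} \widehat{A}(zf + g) &:= \lim_{n\to\infty} A\gamma_n \\ &= \lim_{n\to\infty} A(z\alpha_n + \beta_n) \\ &= \lim_{n\to\infty} (zA\alpha_n + A\beta_n) \\ &= z \lim_{n\to\infty} A\alpha_n + \lim_{n\to\infty} A\beta_n \\ &=: z\widehat{A}f + \widehat{A}g. \end{aligned} \]

For boundedness, let $f \in V$ and $\{\alpha_n\}_{n\in\mathbb{N}}$ a sequence in $D_A$ which converges to $f$. Then, since $A$ is bounded,

\[ \begin{aligned} \|\widehat{A}f\|_W &= \big\| \lim_{n\to\infty} A\alpha_n \big\|_W \\ &= \lim_{n\to\infty} \|A\alpha_n\|_W \\ &\leq \lim_{n\to\infty} \|A\| \, \|\alpha_n\|_V \\ &= \|A\| \lim_{n\to\infty} \|\alpha_n\|_V \\ &= \|A\| \, \|f\|_V. \end{aligned} \]

Therefore

\[ \sup_{f \in V} \frac{\|\widehat{A}f\|_W}{\|f\|_V} \leq \sup_{f \in V} \frac{\|A\| \, \|f\|_V}{\|f\|_V} = \|A\| < \infty \]

and hence $\widehat{A}$ is bounded.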


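(e) For uniqueness, suppose that $\widetilde{A} \in \mathcal{L}(V, W)$ is another extension of $A$. Let $f \in V$ and $\{\alpha_n\}_{n\in\mathbb{N}}$ a sequence in $D_A$ which converges to $f$. Then, we have

\[ \|\widetilde{A}f - A\alpha_n\|_W = \|\widetilde{A}f - \widetilde{A}\alpha_n\|_W \leq \|\widetilde{A}\| \, \|f - \alpha_n\|_V. \]

It follows that

\[ \lim_{n\to\infty} (\widetilde{A}f - A\alpha_n) = 0 \]

and hence, for all $f \in V$,

\[ \widetilde{A}f = \lim_{n\to\infty} A\alpha_n =: \widehat{A}f. \]

Therefore, $\widetilde{A} = \widehat{A}$.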


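(f) Finally, we have already shown in part (d) that

\[ \|\widehat{A}\| := \sup_{f \in V} \frac{\|\widehat{A}f\|_W}{\|f\|_V} \leq \|A\|. \]

On the other hand, since $D_A \subseteq V$, we must also have

\[ \|A\| := \sup_{f \in D_A} \frac{\|Af\|_W}{\|f\|_V} = \sup_{f \in D_A} \frac{\|\widehat{A}f\|_W}{\|f\|_V} \leq \sup_{f \in V} \frac{\|\widehat{A}f\|_W}{\|f\|_V} =: \|\widehat{A}\|. \]

Hence, we also have $\|A\| \leq \|\widehat{A}\|$. Thus, $\|\widehat{A}\| = \|A\|$. □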


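Remark 2.15. Note a slight abuse of notation in the equality $\|\widehat{A}\| = \|A\|$. The linear maps $\widehat{A}$ and $A$ belong to $\mathcal{L}(V, W)$ and $\mathcal{L}(D_A, W)$, respectively. These are different normed (in fact, Banach) spaces and, in particular, carry different norms. To be more precise, we should have written

\[ \|\widehat{A}\|_{\mathcal{L}(V,W)} = \|A\|_{\mathcal{L}(D_A,W)}, \]

where

\[ \|\widehat{A}\|_{\mathcal{L}(V,W)} := \sup_{f \in V} \frac{\|\widehat{A}f\|_W}{\|f\|_V} \qquad \text{and} \qquad \|A\|_{\mathcal{L}(D_A,W)} := \sup_{f \in D_A} \frac{\|Af\|_W}{\|f\|_V}. \]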


3 Separable Hilbert Spaces 


3.1 Relationship between norms and inner products 


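A Hilbert space is a vector space $(\mathcal{H}, +, \cdot)$ equipped with a sesqui-linear inner product $\langle\cdot|\cdot\rangle$ which induces a norm $\|\cdot\|_{\mathcal{H}}$ with respect to which $\mathcal{H}$ is a Banach space. Note that by “being induced by $\langle\cdot|\cdot\rangle$” we specifically mean that the norm is defined as

\[ \|\cdot\|_{\mathcal{H}}\colon \mathcal{H} \to \mathbb{R}, \qquad f \mapsto \sqrt{\langle f|f\rangle}. \]

Recall that a sesqui-linear inner product on $\mathcal{H}$ is a map $\langle\cdot|\cdot\rangle\colon \mathcal{H} \times \mathcal{H} \to \mathbb{C}$ which is conjugate symmetric, linear in the second argument and positive-definite. Note that conjugate symmetry together with linearity in the second argument imply conjugate linearity in the first argument:

\[ \begin{aligned} \langle z\psi_1 + \psi_2|\varphi\rangle &= \overline{\langle \varphi|z\psi_1 + \psi_2\rangle} \\ &= \overline{z\langle\varphi|\psi_1\rangle + \langle\varphi|\psi_2\rangle} \\ &= \overline{z}\,\langle\psi_1|\varphi\rangle + \langle\psi_2|\varphi\rangle. \end{aligned} \]

Of course, since Hilbert spaces are a special case of Banach spaces, everything that we have learned about Banach spaces also applies to Hilbert spaces. For instance, $\mathcal{L}(\mathcal{H}, \mathcal{H})$, the collection of all bounded linear maps $\mathcal{H} \to \mathcal{H}$, is a Banach space with respect to the operator norm. In particular, the dual of a Hilbert space $\mathcal{H}$ is just $\mathcal{H}^* := \mathcal{L}(\mathcal{H}, \mathbb{C})$. We will see that the operator norm on $\mathcal{H}^*$ is such that there exists an inner product on $\mathcal{H}^*$ which induces it, so that the dual of a Hilbert space is again a Hilbert space.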

First, in order to check that the norm induced by an inner product on V is indeed a 
norm on V, we need one of the most important inequalities in mathematics. 


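Proposition 3.1 (Cauchy-Schwarz inequality⁷). Let $\langle\cdot|\cdot\rangle$ be a sesqui-linear inner product on $V$. Then, for any $f, g \in V$, we have

\[ |\langle f|g\rangle|^2 \leq \langle f|f\rangle \langle g|g\rangle. \]

Proof. If $f = 0$ or $g = 0$, then equality holds. Hence suppose that $f \neq 0$ and let

\[ z := \frac{\langle f|g\rangle}{\langle f|f\rangle} \in \mathbb{C}. \]

Then, by positive-definiteness of $\langle\cdot|\cdot\rangle$, we have

\[ \begin{aligned} 0 &\leq \langle zf - g|zf - g\rangle \\ &= |z|^2 \langle f|f\rangle - \overline{z}\langle f|g\rangle - z\langle g|f\rangle + \langle g|g\rangle \\ &= \frac{|\langle f|g\rangle|^2}{\langle f|f\rangle} - \frac{|\langle f|g\rangle|^2}{\langle f|f\rangle} - \frac{|\langle f|g\rangle|^2}{\langle f|f\rangle} + \langle g|g\rangle \\ &= -\frac{|\langle f|g\rangle|^2}{\langle f|f\rangle} + \langle g|g\rangle. \end{aligned} \]

By rearranging, since $\langle f|f\rangle > 0$, we obtain the desired inequality. □

⁷ Also known as the Cauchy-Bunyakovsky-Schwarz inequality in the Russian literature.

Note that, by defining $\|f\| := \sqrt{\langle f|f\rangle}$, and using the fact that $|\langle f|g\rangle| \geq 0$, we can write the Cauchy-Schwarz inequality as

\[ |\langle f|g\rangle| \leq \|f\| \, \|g\|. \]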
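A quick numerical sanity check of the Cauchy-Schwarz inequality can be run on $\mathbb{C}^d$ with its standard inner product (linear in the second argument, as in these notes). This is only a sketch on randomly sampled vectors, not a proof; the function names and the fixed random seed are assumptions of the illustration.

```python
import random

def inner(f, g):
    """Standard inner product on C^d: conjugate-linear in the first slot, linear in the second."""
    return sum(x.conjugate() * y for x, y in zip(f, g))

def random_vector(d, rng):
    """A vector in C^d with real and imaginary parts drawn uniformly from [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(d)]

def cauchy_schwarz_gap(f, g):
    """Return <f|f><g|g> - |<f|g>|^2, which the inequality asserts is non-negative."""
    return (inner(f, f) * inner(g, g)).real - abs(inner(f, g)) ** 2

rng = random.Random(0)  # fixed seed so that the run is reproducible
gaps = [cauchy_schwarz_gap(random_vector(4, rng), random_vector(4, rng)) for _ in range(1000)]

# Up to floating-point rounding, no sampled pair violates the inequality.
print(min(gaps) >= -1e-12)
```

Equality is attained when one vector is a scalar multiple of the other, for instance for `f` and `[2j * x for x in f]`.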
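Proposition 3.2. The induced norm on $V$ is a norm.

Proof. Let $f, g \in V$ and $z \in \mathbb{C}$. Then

(i) $\|f\| := \sqrt{\langle f|f\rangle} \geq 0$

(ii) $\|f\| = 0 \ \Leftrightarrow \ \|f\|^2 = 0 \ \Leftrightarrow \ \langle f|f\rangle = 0 \ \Leftrightarrow \ f = 0$ by positive-definiteness

(iii) $\|zf\| := \sqrt{\langle zf|zf\rangle} = \sqrt{\overline{z}z\langle f|f\rangle} = \sqrt{|z|^2\langle f|f\rangle} = |z|\sqrt{\langle f|f\rangle} =: |z| \, \|f\|$

(iv) Using the fact that $z + \overline{z} = 2\operatorname{Re} z$ and $\operatorname{Re} z \leq |z|$ for any $z \in \mathbb{C}$ and the Cauchy-Schwarz inequality, we have

\[ \begin{aligned} \|f + g\|^2 &:= \langle f + g|f + g\rangle \\ &= \langle f|f\rangle + \langle f|g\rangle + \langle g|f\rangle + \langle g|g\rangle \\ &= \langle f|f\rangle + \langle f|g\rangle + \overline{\langle f|g\rangle} + \langle g|g\rangle \\ &= \langle f|f\rangle + 2\operatorname{Re}\langle f|g\rangle + \langle g|g\rangle \\ &\leq \langle f|f\rangle + 2|\langle f|g\rangle| + \langle g|g\rangle \\ &\leq \langle f|f\rangle + 2\|f\| \, \|g\| + \langle g|g\rangle \\ &= (\|f\| + \|g\|)^2. \end{aligned} \]

By taking the square root of both sides, we have $\|f + g\| \leq \|f\| + \|g\|$. □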


Hence, we see that any inner product space (i.e. a vector space equipped with a sesqui- 
linear inner product) is automatically a normed space under the induced norm. It is only 
natural to wonder whether the converse also holds, that is, whether every norm is induced 
by some sesqui-linear inner product. Unfortunately, the answer is negative in general. The 
following theorem gives a necessary and sufficient condition for a norm to be induced by a 
sesqui-linear inner product and, in fact, by a unique such. 




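Theorem 3.3 (Jordan-von Neumann). Let $V$ be a vector space. A norm $\|\cdot\|$ on $V$ is induced by a sesqui-linear inner product $\langle\cdot|\cdot\rangle$ on $V$ if, and only if, the parallelogram identity

\[ \|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2 \]

holds for all $f, g \in V$, in which case, $\langle\cdot|\cdot\rangle$ is determined by the polarisation identity

\[ \begin{aligned} \langle f|g\rangle &= \frac{1}{4} \sum_{k=0}^{3} i^k \|f + i^{4-k} g\|^2 \\ &= \frac{1}{4} \big( \|f + g\|^2 - \|f - g\|^2 + i\|f - ig\|^2 - i\|f + ig\|^2 \big). \end{aligned} \]
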
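Proof. ($\Rightarrow$) If $\|\cdot\|$ is induced by $\langle\cdot|\cdot\rangle$, then by direct computation

\[ \begin{aligned} \|f + g\|^2 + \|f - g\|^2 &:= \langle f + g|f + g\rangle + \langle f - g|f - g\rangle \\ &= \langle f|f\rangle + \langle f|g\rangle + \langle g|f\rangle + \langle g|g\rangle \\ &\quad + \langle f|f\rangle - \langle f|g\rangle - \langle g|f\rangle + \langle g|g\rangle \\ &= 2\langle f|f\rangle + 2\langle g|g\rangle \\ &=: 2\|f\|^2 + 2\|g\|^2, \end{aligned} \]

so the parallelogram identity is satisfied. We also have

\[ \begin{aligned} \|f + g\|^2 - \|f - g\|^2 &:= \langle f + g|f + g\rangle - \langle f - g|f - g\rangle \\ &= \langle f|f\rangle + \langle f|g\rangle + \langle g|f\rangle + \langle g|g\rangle \\ &\quad - \langle f|f\rangle + \langle f|g\rangle + \langle g|f\rangle - \langle g|g\rangle \\ &= 2\langle f|g\rangle + 2\langle g|f\rangle \end{aligned} \]

and

\[ \begin{aligned} i\|f - ig\|^2 - i\|f + ig\|^2 &:= i\langle f - ig|f - ig\rangle - i\langle f + ig|f + ig\rangle \\ &= i\langle f|f\rangle + \langle f|g\rangle - \langle g|f\rangle + i\langle g|g\rangle \\ &\quad - i\langle f|f\rangle + \langle f|g\rangle - \langle g|f\rangle - i\langle g|g\rangle \\ &= 2\langle f|g\rangle - 2\langle g|f\rangle. \end{aligned} \]

Therefore

\[ \|f + g\|^2 - \|f - g\|^2 + i\|f - ig\|^2 - i\|f + ig\|^2 = 4\langle f|g\rangle, \]

that is, the inner product is determined by the polarisation identity.

($\Leftarrow$) Suppose that $\|\cdot\|$ satisfies the parallelogram identity. Define $\langle\cdot|\cdot\rangle$ by

\[ \langle f|g\rangle := \frac{1}{4} \big( \|f + g\|^2 - \|f - g\|^2 + i\|f - ig\|^2 - i\|f + ig\|^2 \big). \]

We need to check that this satisfies the defining properties of a sesqui-linear inner product.

(i) For conjugate symmetry

\[ \begin{aligned} \overline{\langle f|g\rangle} &:= \frac{1}{4} \overline{\big( \|f + g\|^2 - \|f - g\|^2 + i\|f - ig\|^2 - i\|f + ig\|^2 \big)} \\ &= \frac{1}{4} \big( \|f + g\|^2 - \|f - g\|^2 - i\|f - ig\|^2 + i\|f + ig\|^2 \big) \\ &= \frac{1}{4} \big( \|f + g\|^2 - \|f - g\|^2 - i\|(-i)(if + g)\|^2 + i\|i(-if + g)\|^2 \big) \\ &= \frac{1}{4} \big( \|g + f\|^2 - \|g - f\|^2 - i|{-i}|^2\|g + if\|^2 + i|i|^2\|g - if\|^2 \big) \\ &= \frac{1}{4} \big( \|g + f\|^2 - \|g - f\|^2 + i\|g - if\|^2 - i\|g + if\|^2 \big) \\ &=: \langle g|f\rangle. \end{aligned} \]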


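(ii) We will now show linearity in the second argument. This is fairly non-trivial and quite lengthy. We will focus on additivity first. We have

\[ \langle f|g + h\rangle := \frac{1}{4} \big( \|f + g + h\|^2 - \|f - g - h\|^2 + i\|f - ig - ih\|^2 - i\|f + ig + ih\|^2 \big). \]

Consider the real part of $\langle f|g + h\rangle$. By successive applications of the parallelogram identity, we find

\[ \begin{aligned} \operatorname{Re}\langle f|g + h\rangle &= \tfrac{1}{4} \big( \|f + g + h\|^2 - \|f - g - h\|^2 \big) \\ &= \tfrac{1}{4} \big( \|f + g + h\|^2 + \|f + g - h\|^2 - \|f + g - h\|^2 - \|f - g - h\|^2 \big) \\ &= \tfrac{1}{4} \big( 2\|f + g\|^2 + 2\|h\|^2 - 2\|f - h\|^2 - 2\|g\|^2 \big) \\ &= \tfrac{1}{4} \big( 2\|f + g\|^2 + 2\|f\|^2 + 2\|h\|^2 - 2\|f - h\|^2 - 2\|f\|^2 - 2\|g\|^2 \big) \\ &= \tfrac{1}{4} \big( 2\|f + g\|^2 + \|f + h\|^2 + \|f - h\|^2 - 2\|f - h\|^2 - \|f + g\|^2 - \|f - g\|^2 \big) \\ &= \tfrac{1}{4} \big( \|f + g\|^2 + \|f + h\|^2 - \|f - h\|^2 - \|f - g\|^2 \big) \\ &= \operatorname{Re}\langle f|g\rangle + \operatorname{Re}\langle f|h\rangle. \end{aligned} \]

Replacing $g$ and $h$ with $-ig$ and $-ih$ respectively, we obtain

\[ \operatorname{Im}\langle f|g + h\rangle = \operatorname{Im}\langle f|g\rangle + \operatorname{Im}\langle f|h\rangle. \]

Hence, we have

\[ \begin{aligned} \langle f|g + h\rangle &= \operatorname{Re}\langle f|g + h\rangle + i\operatorname{Im}\langle f|g + h\rangle \\ &= \operatorname{Re}\langle f|g\rangle + \operatorname{Re}\langle f|h\rangle + i\big( \operatorname{Im}\langle f|g\rangle + \operatorname{Im}\langle f|h\rangle \big) \\ &= \operatorname{Re}\langle f|g\rangle + i\operatorname{Im}\langle f|g\rangle + \operatorname{Re}\langle f|h\rangle + i\operatorname{Im}\langle f|h\rangle \\ &= \langle f|g\rangle + \langle f|h\rangle, \end{aligned} \]

which proves additivity.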


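For scaling invariance, we will proceed in several steps.

(a) First, note that

\[ \langle f|0\rangle := \tfrac{1}{4} \big( \|f\|^2 - \|f\|^2 + i\|f\|^2 - i\|f\|^2 \big) = 0 \]

and hence $\langle f|0g\rangle = 0\langle f|g\rangle$ holds.

(b) Suppose that $\langle f|ng\rangle = n\langle f|g\rangle$ for some $n \in \mathbb{N}$. Then, by additivity

\[ \begin{aligned} \langle f|(n+1)g\rangle &= \langle f|ng + g\rangle \\ &= \langle f|ng\rangle + \langle f|g\rangle \\ &= n\langle f|g\rangle + \langle f|g\rangle \\ &= (n+1)\langle f|g\rangle. \end{aligned} \]

Hence, by induction on $n$ with base case (a), we have

\[ \forall\, n \in \mathbb{N}: \ \langle f|ng\rangle = n\langle f|g\rangle. \]

(c) Note that by additivity

\[ \langle f|g\rangle + \langle f|{-g}\rangle = \langle f|g - g\rangle = \langle f|0\rangle \overset{\text{(a)}}{=} 0. \]

Hence $\langle f|{-g}\rangle = -\langle f|g\rangle$.

(d) Then, for any $n \in \mathbb{N}$

\[ \langle f|{-ng}\rangle \overset{\text{(c)}}{=} -\langle f|ng\rangle \overset{\text{(b)}}{=} -n\langle f|g\rangle \]

and thus

\[ \forall\, n \in \mathbb{Z}: \ \langle f|ng\rangle = n\langle f|g\rangle. \]

(e) Now note that for any $m \in \mathbb{Z} \setminus \{0\}$

\[ m\langle f|\tfrac{1}{m}g\rangle \overset{\text{(d)}}{=} \langle f|m\tfrac{1}{m}g\rangle = \langle f|g\rangle \]

and hence, by dividing by $m$, we have $\langle f|\tfrac{1}{m}g\rangle = \tfrac{1}{m}\langle f|g\rangle$.

(f) Therefore, for any $r = \tfrac{n}{m} \in \mathbb{Q}$, we have

\[ \langle f|rg\rangle = \langle f|\tfrac{n}{m}g\rangle \overset{\text{(d)}}{=} n\langle f|\tfrac{1}{m}g\rangle \overset{\text{(e)}}{=} \tfrac{n}{m}\langle f|g\rangle = r\langle f|g\rangle \]

and hence

\[ \forall\, r \in \mathbb{Q}: \ \langle f|rg\rangle = r\langle f|g\rangle. \]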


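(g) Before we turn to $\mathbb{R}$, we need to show that $|\langle f|g\rangle| \leq \sqrt{2}\,\|f\| \, \|g\|$. Note that here we cannot invoke the Cauchy-Schwarz inequality (which would also provide a better estimate) since we don’t know that $\langle\cdot|\cdot\rangle$ is an inner product yet. First, consider the real part of $\langle f|g\rangle$.

\[ \begin{aligned} \operatorname{Re}\langle f|g\rangle &= \tfrac{1}{4} \big( \|f + g\|^2 - \|f - g\|^2 \big) \\ &= \tfrac{1}{4} \big( 2\|f + g\|^2 - \|f + g\|^2 - \|f - g\|^2 \big) \\ &= \tfrac{1}{4} \big( 2\|f + g\|^2 - 2\|f\|^2 - 2\|g\|^2 \big) \\ &\leq \tfrac{1}{4} \big( 2(\|f\| + \|g\|)^2 - 2\|f\|^2 - 2\|g\|^2 \big) \\ &= \tfrac{1}{4} \big( 2\|f\|^2 + 4\|f\| \, \|g\| + 2\|g\|^2 - 2\|f\|^2 - 2\|g\|^2 \big) \\ &= \|f\| \, \|g\|. \end{aligned} \]

Replacing $g$ with $-g$ and using (c), we also get $-\operatorname{Re}\langle f|g\rangle \leq \|f\| \, \|g\|$, so that $|\operatorname{Re}\langle f|g\rangle| \leq \|f\| \, \|g\|$. Replacing $g$ with $-ig$ and noting that $\|{-ig}\| = |{-i}| \, \|g\| = \|g\|$, we similarly have

\[ |\operatorname{Im}\langle f|g\rangle| \leq \|f\| \, \|g\|. \]

Hence, we find

\[ \begin{aligned} |\langle f|g\rangle| &= |\operatorname{Re}\langle f|g\rangle + i\operatorname{Im}\langle f|g\rangle| \\ &= \sqrt{(\operatorname{Re}\langle f|g\rangle)^2 + (\operatorname{Im}\langle f|g\rangle)^2} \\ &\leq \sqrt{(\|f\| \, \|g\|)^2 + (\|f\| \, \|g\|)^2} \\ &= \sqrt{2}\,\|f\| \, \|g\|. \end{aligned} \]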
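(h) Let $r \in \mathbb{R}$. Since $\mathbb{R}$ is the completion of $\mathbb{Q}$ (equivalently, $\mathbb{Q}$ is dense in $\mathbb{R}$), there exists a sequence $\{r_n\}_{n\in\mathbb{N}}$ in $\mathbb{Q}$ which converges to $r$. Let $\varepsilon > 0$ and suppose $f, g \neq 0$ (otherwise the claim below is trivial). Then, there exist $N_1, N_2 \in \mathbb{N}$ such that

\[ \forall\, n \geq N_1: \ |r_n - r| < \frac{\varepsilon}{2\sqrt{2}\,\|f\| \, \|g\|}, \qquad \forall\, n, m \geq N_2: \ |r_n - r_m| < \frac{\varepsilon}{2\sqrt{2}\,\|f\| \, \|g\|}. \]

Let $N := \max\{N_1, N_2\}$ and fix $m \geq N$. Then, for all $n \geq N$, we have

\[ \begin{aligned} |r_n\langle f|g\rangle - \langle f|rg\rangle| &= |r_n\langle f|g\rangle - r_m\langle f|g\rangle + r_m\langle f|g\rangle - \langle f|rg\rangle| \\ &\overset{\text{(f)}}{=} |r_n\langle f|g\rangle - r_m\langle f|g\rangle + \langle f|r_m g\rangle - \langle f|rg\rangle| \\ &= |(r_n - r_m)\langle f|g\rangle + \langle f|(r_m - r)g\rangle| \\ &\leq |(r_n - r_m)\langle f|g\rangle| + |\langle f|(r_m - r)g\rangle| \\ &\overset{\text{(g)}}{\leq} \sqrt{2}\,|r_n - r_m| \, \|f\| \, \|g\| + \sqrt{2}\,\|f\| \, \|(r_m - r)g\| \\ &= \sqrt{2}\,|r_n - r_m| \, \|f\| \, \|g\| + \sqrt{2}\,|r_m - r| \, \|f\| \, \|g\| \\ &< \sqrt{2}\,\frac{\varepsilon}{2\sqrt{2}\,\|f\| \, \|g\|}\,\|f\| \, \|g\| + \sqrt{2}\,\frac{\varepsilon}{2\sqrt{2}\,\|f\| \, \|g\|}\,\|f\| \, \|g\| \\ &= \varepsilon, \end{aligned} \]

that is, $\lim_{n\to\infty} r_n\langle f|g\rangle = \langle f|rg\rangle$.

(i) Hence, for any $r \in \mathbb{R}$, we have

\[ r\langle f|g\rangle = \big( \lim_{n\to\infty} r_n \big) \langle f|g\rangle = \lim_{n\to\infty} r_n\langle f|g\rangle \overset{\text{(h)}}{=} \langle f|rg\rangle \]

and thus

\[ \forall\, r \in \mathbb{R}: \ r\langle f|g\rangle = \langle f|rg\rangle. \]

(j) We now note that

\[ \begin{aligned} \langle f|ig\rangle &:= \tfrac{1}{4} \big( \|f + ig\|^2 - \|f - ig\|^2 + i\|f - i^2 g\|^2 - i\|f + i^2 g\|^2 \big) \\ &= \tfrac{1}{4}\, i \big( {-i}\|f + ig\|^2 + i\|f - ig\|^2 + \|f + g\|^2 - \|f - g\|^2 \big) \\ &= i\langle f|g\rangle \end{aligned} \]

and hence $\langle f|ig\rangle = i\langle f|g\rangle$.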


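(k) Let $z \in \mathbb{C}$. By additivity, we have

\[ \begin{aligned} \langle f|zg\rangle &= \langle f|(\operatorname{Re} z + i\operatorname{Im} z)g\rangle \\ &= \langle f|(\operatorname{Re} z)g\rangle + \langle f|i(\operatorname{Im} z)g\rangle \\ &\overset{\text{(j)}}{=} \langle f|(\operatorname{Re} z)g\rangle + i\langle f|(\operatorname{Im} z)g\rangle \\ &\overset{\text{(i)}}{=} \operatorname{Re} z\,\langle f|g\rangle + i\operatorname{Im} z\,\langle f|g\rangle \\ &= (\operatorname{Re} z + i\operatorname{Im} z)\langle f|g\rangle \\ &= z\langle f|g\rangle, \end{aligned} \]

which shows scaling invariance in the second argument.

Combining additivity and scaling invariance in the second argument yields linearity in the second argument.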


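(iii) For positive-definiteness

\[ \begin{aligned} \langle f|f\rangle &:= \tfrac{1}{4} \big( \|f + f\|^2 - \|f - f\|^2 + i\|f - if\|^2 - i\|f + if\|^2 \big) \\ &= \tfrac{1}{4} \big( 4\|f\|^2 + i|1 - i|^2\|f\|^2 - i|1 + i|^2\|f\|^2 \big) \\ &= \tfrac{1}{4} \big( 4 + i|1 - i|^2 - i|1 + i|^2 \big) \|f\|^2 \\ &= \tfrac{1}{4} (4 + 2i - 2i) \|f\|^2 \\ &= \|f\|^2. \end{aligned} \]

Thus, $\langle f|f\rangle \geq 0$ and $\langle f|f\rangle = 0 \Leftrightarrow f = 0$.

Hence, $\langle\cdot|\cdot\rangle$ is indeed a sesqui-linear inner product. Note that, from part (iii) above, we have

\[ \sqrt{\langle f|f\rangle} = \|f\|. \]

That is, the inner product $\langle\cdot|\cdot\rangle$ does induce the norm from which we started, and this completes the proof. □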
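The polarisation identity of Theorem 3.3 can be sanity-checked numerically on $\mathbb{C}^d$, whose standard inner product is linear in the second argument, matching the convention used here. The sketch below recovers $\langle f|g\rangle$ from norms alone; the function names and the sample vectors are assumptions made for the illustration.

```python
def inner(f, g):
    """Standard inner product on C^d: conjugate-linear in the first slot, linear in the second."""
    return sum(x.conjugate() * y for x, y in zip(f, g))

def norm_sq(f):
    """Squared Euclidean norm on C^d."""
    return sum(abs(x) ** 2 for x in f)

def combine(f, g, c):
    """Return the vector f + c * g."""
    return [x + c * y for x, y in zip(f, g)]

def polarisation(f, g):
    """Right-hand side of the polarisation identity, computed from norms only."""
    return 0.25 * (norm_sq(combine(f, g, 1)) - norm_sq(combine(f, g, -1))
                   + 1j * norm_sq(combine(f, g, -1j)) - 1j * norm_sq(combine(f, g, 1j)))

f = [1 + 2j, -1j]
g = [0.5 + 0j, 3 - 1j]
print(polarisation(f, g), inner(f, g))  # the two values agree up to rounding
```

Swapping the signs of the two imaginary terms would instead reproduce the convention in which the inner product is linear in the first argument.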


Remark 3.4. Our proof of linearity is based on the hints given in Section 6.1, Exercise 27, from Linear Algebra (4th Edition) by Friedberg, Insel, Spence. Other proofs of the Jordan-von Neumann theorem can be found in

• Kadison, Ringrose, Fundamentals of the Theory of Operator Algebras: Volume I: Elementary Theory, American Mathematical Society 1997.

• Kutateladze, Fundamentals of Functional Analysis, Springer 1996.

• Day, Normed Linear Spaces, Springer 1973.


Remark 3.5. Note that, often in the more mathematical literature, a sesqui-linear inner product is defined to be linear in the first argument rather than the second. In that case, the polarisation identity takes the form

\[ \langle f|g\rangle = \frac{1}{4} \sum_{k=0}^{3} i^k \|f + i^k g\|^2. \]


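Example 3.6. Consider $C^0_{\mathbb{C}}[0, 1]$ and let $f(x) = x$ and $g(x) = 1$. Then

\[ \|f\|_\infty = 1, \qquad \|g\|_\infty = 1, \qquad \|f + g\|_\infty = 2, \qquad \|f - g\|_\infty = 1 \]

and hence

\[ \|f + g\|_\infty^2 + \|f - g\|_\infty^2 = 5 \neq 4 = 2\|f\|_\infty^2 + 2\|g\|_\infty^2. \]

Thus, by the Jordan-von Neumann theorem, there is no inner product on $C^0_{\mathbb{C}}[0, 1]$ which induces the supremum norm. Therefore, $(C^0_{\mathbb{C}}[0, 1], \|\cdot\|_\infty)$ cannot be a Hilbert space.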
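The arithmetic of Example 3.6 can be reproduced with a few lines of Python. This is only a sketch: the supremum norm is approximated by sampling on a uniform grid (which happens to be exact for these two functions, since the extrema sit at the endpoints), and the name `sup_norm` is an assumption of the illustration.

```python
def sup_norm(func, samples=1001):
    """Approximate the supremum norm on [0, 1] by sampling a uniform grid."""
    return max(abs(func(k / (samples - 1))) for k in range(samples))

f = lambda x: x
g = lambda x: 1.0

# Left- and right-hand sides of the parallelogram identity for the supremum norm.
lhs = sup_norm(lambda x: f(x) + g(x)) ** 2 + sup_norm(lambda x: f(x) - g(x)) ** 2
rhs = 2 * sup_norm(f) ** 2 + 2 * sup_norm(g) ** 2
print(lhs, rhs)  # prints: 5.0 4.0
```

Since the two sides differ, the parallelogram identity fails for this pair of functions.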


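Proposition 3.7. Let $\mathcal{H}$ be a Hilbert space. Then, $\mathcal{H}^*$ is a Hilbert space.

Proof. We already know that $\mathcal{H}^* := \mathcal{L}(\mathcal{H}, \mathbb{C})$ is a Banach space. The norm on $\mathcal{H}^*$ is just the usual operator norm

\[ \|f\|_{\mathcal{H}^*} := \sup_{\varphi \in \mathcal{H}} \frac{|f(\varphi)|}{\|\varphi\|_{\mathcal{H}}}, \]

where, admittedly somewhat perversely, we have reversed our previous notation for the dual elements. Since the modulus is induced by the standard inner product on $\mathbb{C}$, i.e. $|z| = \sqrt{\overline{z}z}$, it satisfies the parallelogram identity. Hence, we have

\[ \begin{aligned} \|f + g\|_{\mathcal{H}^*}^2 + \|f - g\|_{\mathcal{H}^*}^2 &:= \Big( \sup_{\varphi \in \mathcal{H}} \frac{|(f + g)(\varphi)|}{\|\varphi\|_{\mathcal{H}}} \Big)^2 + \Big( \sup_{\varphi \in \mathcal{H}} \frac{|(f - g)(\varphi)|}{\|\varphi\|_{\mathcal{H}}} \Big)^2 \\ &= \sup_{\varphi \in \mathcal{H}} \frac{|(f + g)(\varphi)|^2}{\|\varphi\|_{\mathcal{H}}^2} + \sup_{\varphi \in \mathcal{H}} \frac{|(f - g)(\varphi)|^2}{\|\varphi\|_{\mathcal{H}}^2} \\ &= \sup_{\varphi \in \mathcal{H}} \frac{|f(\varphi) + g(\varphi)|^2 + |f(\varphi) - g(\varphi)|^2}{\|\varphi\|_{\mathcal{H}}^2} \\ &= \sup_{\varphi \in \mathcal{H}} \frac{2|f(\varphi)|^2 + 2|g(\varphi)|^2}{\|\varphi\|_{\mathcal{H}}^2} \\ &= 2\sup_{\varphi \in \mathcal{H}} \frac{|f(\varphi)|^2}{\|\varphi\|_{\mathcal{H}}^2} + 2\sup_{\varphi \in \mathcal{H}} \frac{|g(\varphi)|^2}{\|\varphi\|_{\mathcal{H}}^2} \\ &=: 2\|f\|_{\mathcal{H}^*}^2 + 2\|g\|_{\mathcal{H}^*}^2, \end{aligned} \]

where several steps are justified by the fact that the quantities involved are non-negative. Strictly speaking, the two steps that merge the two suprema into one and later split one supremum into two are not justified by non-negativity alone; a fully rigorous argument identifies each element of $\mathcal{H}^*$ with a vector of $\mathcal{H}$ via the Riesz representation theorem and transfers the parallelogram identity from $\mathcal{H}$.

Hence, by the Jordan-von Neumann theorem, the inner product on $\mathcal{H}^*$ defined by the polarisation identity induces $\|\cdot\|_{\mathcal{H}^*}$. Hence, $\mathcal{H}^*$ is a Hilbert space. □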


The following useful fact is an immediate application of the Cauchy-Schwarz inequality. 
Proposition 3.8. Inner products on a vector space are sequentially continuous. 


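Proof. Let $\langle\cdot|\cdot\rangle$ be an inner product on $V$. Fix $\varphi \in V$ and let $\lim_{n\to\infty} \psi_n = \psi$. Then

\[ \begin{aligned} |\langle\varphi|\psi_n\rangle - \langle\varphi|\psi\rangle| &= |\langle\varphi|\psi_n - \psi\rangle| \\ &\leq \|\varphi\| \, \|\psi_n - \psi\| \end{aligned} \]

and hence $\lim_{n\to\infty} \langle\varphi|\psi_n\rangle = \langle\varphi|\psi\rangle$. □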


3.2 Hamel versus Schauder⁸


Choosing a basis on a vector space is normally regarded as mathematically inelegant. The reason for this is that most statements about vector spaces are much clearer and, we maintain, aesthetically pleasing when expressed without making reference to a basis. However, in addition to the fact that some statements are more easily and usefully written in terms of a basis, bases provide a convenient way to specify the elements of a vector space in terms of components. The notion of basis for a vector space that you most probably met in your linear algebra course is more properly known as a Hamel basis.


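Definition. A Hamel basis of a vector space $V$ is a subset $\mathcal{B} \subseteq V$ such that

(i) any finite subset $\{e_1, \ldots, e_n\} \subseteq \mathcal{B}$ is linearly independent, i.e.

\[ \sum_{i=1}^{n} \lambda^i e_i = 0 \quad \Rightarrow \quad \lambda^1 = \cdots = \lambda^n = 0; \]

(ii) the set $\mathcal{B}$ is a generating (or spanning) set for $V$. That is, for any element $v \in V$, there exist a finite subset $\{e_1, \ldots, e_n\} \subseteq \mathcal{B}$ and $\lambda^1, \ldots, \lambda^n \in \mathbb{C}$ such that

\[ v = \sum_{i=1}^{n} \lambda^i e_i. \]

Equivalently, by defining the linear span of a subset $U \subseteq V$ as

\[ \operatorname{span} U := \Big\{ \sum_{i=1}^{n} \lambda^i u_i \ \Big|\ \lambda^1, \ldots, \lambda^n \in \mathbb{C},\ u_1, \ldots, u_n \in U \text{ and } n \geq 1 \Big\}, \]

i.e. the set of all finite linear combinations of elements of $U$ with complex coefficients, we can restate this condition simply as $V = \operatorname{span} \mathcal{B}$.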


Given a basis $\mathcal{B}$, one can show that for each $v \in V$ the $\lambda^1, \ldots, \lambda^n$ appearing in (ii) above are uniquely determined. They are called the components of $v$ with respect to $\mathcal{B}$.

One can also show that if a vector space admits a finite Hamel basis $\mathcal{B}$, then any other basis of $V$ is also finite and, in fact, of the same cardinality as $\mathcal{B}$.

Definition. If a vector space $V$ admits a finite Hamel basis, then it is said to be finite-dimensional and its dimension is $\dim V := |\mathcal{B}|$. Otherwise, it is said to be infinite-dimensional and we write $\dim V = \infty$.


Theorem 3.9. Every vector space admits a Hamel basis. 


For a proof of (a slightly more general version of) this theorem, we refer the interested 
reader to Dr Schuller’s Lectures on the Geometric Anatomy of Theoretical Physics. 


Note that the proof that every vector space admits a Hamel basis relies on the axiom of choice and, hence, it is non-constructive. By a corollary to Baire’s category theorem, a Hamel basis on a Banach space is either finite or uncountably infinite. Thus, while every Banach space admits a Hamel basis, such bases on infinite-dimensional Banach spaces are difficult to construct explicitly and, hence, not terribly useful to express vectors in terms of components and perform computations. Thankfully, we can use the extra structure of a Banach space to define a more useful type of basis.

⁸ Not a boxing competition.


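Definition. Let $(W, \|\cdot\|)$ be a Banach space. A Schauder basis of $W$ is a sequence $\{e_n\}_{n\in\mathbb{N}}$ in $W$ such that, for any $f \in W$, there exists a unique sequence $\{\lambda^n\}_{n\in\mathbb{N}}$ in $\mathbb{C}$ such that

\[ f = \lim_{n\to\infty} \sum_{i=0}^{n} \lambda^i e_i =: \sum_{i=0}^{\infty} \lambda^i e_i \]

or, by explicitly using the definition of limit in $W$,

\[ \lim_{n\to\infty} \Big\| f - \sum_{i=0}^{n} \lambda^i e_i \Big\| = 0. \]
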
We note the following points.

• Since Schauder bases require a notion of convergence, they can only be defined on a vector space equipped with a (compatible) topological structure, of which Banach spaces are a special case.

• Unlike Hamel bases, Schauder bases need not exist.

• Since the convergence of a series may depend on the order of its terms, Schauder bases must be considered as ordered bases. Hence, two Schauder bases that merely differ in the ordering of their elements are different bases, and permuting the elements of a Schauder basis doesn’t necessarily yield another Schauder basis.

• The uniqueness requirement in the definition immediately implies that the zero vector cannot be an element of a Schauder basis.

• Schauder bases satisfy a stronger linear independence property than Hamel bases, namely

\[ \sum_{i=0}^{\infty} \lambda^i e_i = 0 \quad \Rightarrow \quad \forall\, i \in \mathbb{N}: \ \lambda^i = 0. \]

• At the same time, they satisfy a weaker spanning condition. Rather than the linear span of the basis being equal to $W$, we only have that it is dense in $W$. Equivalently,

\[ W = \overline{\operatorname{span}\{e_n \mid n \in \mathbb{N}\}}, \]

where the topological closure $\overline{U}$ of a subset $U \subseteq W$ is defined as the set of limits of all convergent sequences in $U$,

\[ \overline{U} := \Big\{ \lim_{n\to\infty} u_n \ \Big|\ \forall\, n \in \mathbb{N}: u_n \in U \Big\}. \]


Definition. A Schauder basis $\{e_n\}_{n\in\mathbb{N}}$ of $(W, \|\cdot\|)$ is said to be normalised if

\[ \forall\, n \in \mathbb{N}: \ \|e_n\| = 1. \]

Multiplying an element of a Schauder basis by a non-zero complex number gives again a Schauder basis (not the same one, of course). Since Schauder bases do not contain the zero vector, any Schauder basis $\{e_n\}_{n\in\mathbb{N}}$ gives rise to a normalised Schauder basis $\{\widetilde{e}_n\}_{n\in\mathbb{N}}$ by defining

\[ \widetilde{e}_n := \frac{e_n}{\|e_n\|}. \]


3.3 Separable Hilbert spaces 


Separability is a topological property. A topological space is said to be separable if it 
contains a dense subset which is also countable. A Banach space is said to be separable 
if it is separable as a topological space with the topology induced by the norm. Similarly, 
a Hilbert space is said to be separable if it is separable as a topological space with the 
topology induced by the norm induced in turn by the inner product. 

For infinite-dimensional Hilbert spaces, there is a much more useful characterisation of 
separability, which we will henceforth take as our definition. 


Proposition 3.10. An infinite-dimensional Hilbert space is separable if, and only if, it admits an orthonormal Schauder basis. That is, a Schauder basis $\{e_n\}_{n\in\mathbb{N}}$ such that

\[ \forall\, i, j \in \mathbb{N}: \ \langle e_i|e_j\rangle = \delta_{ij} := \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases} \]

Whether this holds for Banach spaces or not was a famous open problem in functional analysis, problem 153 from the Scottish book. It was solved in 1972, more than three decades after it was first posed, when Swedish mathematician Enflo constructed an infinite-dimensional separable Banach space which lacks a Schauder basis. That same year, he was awarded a live goose⁹ for his effort.

Remark 3.11. The Kronecker symbol $\delta_{ij}$ appearing above does not represent the components of the identity map on $\mathcal{H}$. Instead, $\delta_{ij}$ are the components of the sesqui-linear form $\langle\cdot|\cdot\rangle$, which is a map $\mathcal{H} \times \mathcal{H} \to \mathbb{C}$, unlike $\mathrm{id}_{\mathcal{H}}$ which is a map $\mathcal{H} \to \mathcal{H}$. If not immediately understood, this remark may be safely ignored.


Remark 3.12. In finite dimensions, since every finite-dimensional vector space admits (by definition) a finite Hamel basis, every inner product space admits an orthonormal basis by the Gram-Schmidt orthonormalisation process.
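The Gram-Schmidt process is not spelled out in these notes. The following is a minimal sketch of one standard variant (the “modified” form, which subtracts the projections one at a time) on $\mathbb{C}^d$ with the standard inner product; the function names, the tolerance and the sample vectors are assumptions made for the illustration.

```python
import math

def inner(f, g):
    """Standard inner product on C^d: conjugate-linear in the first slot, linear in the second."""
    return sum(x.conjugate() * y for x, y in zip(f, g))

def gram_schmidt(vectors, tol=1e-12):
    """Turn a linearly independent list of vectors in C^d into an orthonormal list."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            coeff = inner(e, w)  # component of w along e; the order <e|w> matters
            w = [x - coeff * y for x, y in zip(w, e)]
        length = math.sqrt(inner(w, w).real)
        if length < tol:
            raise ValueError("input vectors are (numerically) linearly dependent")
        basis.append([x / length for x in w])
    return basis

# A hypothetical Hamel basis of C^3, chosen only for this example.
vectors = [[1 + 0j, 1j, 0j], [1 + 0j, 0j, 1 + 0j], [0j, 1 + 0j, 1 + 0j]]
orthonormal = gram_schmidt(vectors)
print([[round(abs(inner(a, b)), 12) for b in orthonormal] for a in orthonormal])
```

The printed matrix of moduli $|\langle e_i|e_j\rangle|$ should be the identity matrix up to rounding, which is exactly the orthonormality condition $\langle e_i|e_j\rangle = \delta_{ij}$.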


From now on, we will only consider orthonormal Schauder bases, sometimes also called 
Hilbert bases, and just call them bases. 


Lemma 3.13. Let H be a Hilbert space with basis {en}nen. The unique sequence in the 
expansion of » € H in terms of this basis is {(en|W) }nen. 


* https://en.wikipedia.org/wiki/Per_Enflo#Basis_problem_of_Banach




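Proof. Let $\psi = \sum_{j=0}^{\infty} \lambda^j e_j$ be the expansion of $\psi$ in terms of the basis. By using the continuity of the inner product, we have

\[
\begin{aligned}
\langle e_i | \psi \rangle &= \Big\langle e_i \Big| \sum_{j=0}^{\infty} \lambda^j e_j \Big\rangle \\
&= \Big\langle e_i \Big| \lim_{n\to\infty} \sum_{j=0}^{n} \lambda^j e_j \Big\rangle \\
&= \lim_{n\to\infty} \Big\langle e_i \Big| \sum_{j=0}^{n} \lambda^j e_j \Big\rangle \\
&= \lim_{n\to\infty} \sum_{j=0}^{n} \lambda^j \langle e_i | e_j \rangle \\
&= \sum_{j=0}^{\infty} \lambda^j \delta_{ij} \\
&= \lambda^i,
\end{aligned}
\]

which is what we wanted. □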


While we have already used the term orthonormal, let us note that this means both orthogonal and normalised. Two vectors $\varphi, \psi \in H$ are said to be orthogonal if $\langle \varphi | \psi \rangle = 0$, and a subset of $H$ is called orthogonal if its elements are pairwise orthogonal.


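Lemma 3.14 (Pythagoras’ theorem). Let $H$ be a Hilbert space and let $\{\psi_0, \ldots, \psi_n\} \subset H$ be a finite orthogonal set. Then

\[ \Big\| \sum_{i=0}^{n} \psi_i \Big\|^2 = \sum_{i=0}^{n} \|\psi_i\|^2. \]


Proof. Using the pairwise orthogonality of $\{\psi_0, \ldots, \psi_n\}$, we simply calculate

\[ \Big\| \sum_{i=0}^{n} \psi_i \Big\|^2 = \Big\langle \sum_{i=0}^{n} \psi_i \Big| \sum_{j=0}^{n} \psi_j \Big\rangle = \sum_{i=0}^{n} \langle \psi_i | \psi_i \rangle + \sum_{i \neq j} \langle \psi_i | \psi_j \rangle = \sum_{i=0}^{n} \|\psi_i\|^2. \]
□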


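Corollary 3.15. Let $\psi \in H$ and let $\{e_n\}_{n\in\mathbb{N}}$ be a basis of $H$. Then

\[ \|\psi\|^2 = \sum_{i=0}^{\infty} |\langle e_i | \psi \rangle|^2. \]


Proof. By continuity of the norm, we have

\[ \|\psi\|^2 = \lim_{n\to\infty} \Big\| \sum_{i=0}^{n} \langle e_i | \psi \rangle e_i \Big\|^2 = \lim_{n\to\infty} \sum_{i=0}^{n} |\langle e_i | \psi \rangle|^2 \, \|e_i\|^2 = \sum_{i=0}^{\infty} |\langle e_i | \psi \rangle|^2. \]
□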
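
As a finite-dimensional sanity check of the expansion coefficients and of the norm identity above, the following sketch works in $\mathbb{C}^2$ with a hypothetical orthonormal basis and test vector (both chosen purely for illustration).

```python
import math

# <a|b> = sum(conj(a_i) * b_i), anti-linear in the first slot.
def inner(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

s = 1 / math.sqrt(2)
e0 = [s + 0j, s + 0j]    # an orthonormal basis of C^2
e1 = [s + 0j, -s + 0j]
psi = [2 + 1j, -1 + 3j]  # arbitrary test vector

# Expansion coefficients <e_i|psi>.
coeffs = [inner(e, psi) for e in (e0, e1)]

# psi is recovered as sum_i <e_i|psi> e_i.
reconstructed = [coeffs[0] * a + coeffs[1] * b for a, b in zip(e0, e1)]

# ||psi||^2 equals the sum of |<e_i|psi>|^2.
norm_sq = inner(psi, psi).real
parseval = sum(abs(c) ** 2 for c in coeffs)
```

Up to floating-point rounding, `reconstructed` agrees with `psi` and `parseval` agrees with `norm_sq`.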


3.4 Unitary maps 


An insightful way to study a structure in mathematics is to consider maps between different 
instances A, B, C, ... of that structure, and especially the structure-preserving maps. If
a certain structure-preserving map $A \to B$ is invertible and its inverse is also structure-
preserving, then both these maps are generically called isomorphisms and A and B are
said to be isomorphic instances of that structure. Isomorphic instances of a structure are 
essentially the same instance of that structure, just dressed up in different ways. Typically, 
there are infinitely many concrete instances of any given structure. The highest form of 
understanding of a structure that we can hope to achieve is that of a classification of its 
instances up to isomorphism. That is, we would like to know how many different, non- 
isomorphic instances of a given structure there are. 

In linear algebra, the structure of interest is that of vector space over some field F. The 
structure-preserving maps are just the linear maps and the isomorphisms are the linear 
bijections (whose inverses are automatically linear). Finite-dimensional vector spaces over 
F are completely classified by their dimension, i.e. there is essentially only one vector space 
over F for each $n \in \mathbb{N}$, and $F^n$ is everyone’s favourite. Assuming the axiom of choice,
infinite-dimensional vector spaces over F are classified in the same way, namely, there is, 
up to linear isomorphism, only one vector space over F for each infinite cardinal. 

Of course, one could do better and also classify the base fields themselves. The classi- 
fication of finite fields (i.e. fields with a finite number of elements) was achieved in 1893 by 
Moore, who proved that the order (i.e. cardinality) of a finite field is necessarily a power of
some prime number, and that there is only one finite field of each such order, up to the appropriate
notion of isomorphism.


A classification with far-reaching implications in physics is that of finite-dimensional, 
semi-simple, complex Lie algebras, which is discussed in some detail in Dr Schuller’s Lectures 
on the Geometric Anatomy of Theoretical Physics. 

The structure-preserving maps between Hilbert spaces are those that preserve both the 
vector space structure and the inner product. The Hilbert space isomorphisms are called 
unitary maps. 


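Definition. Let $H$ and $G$ be Hilbert spaces. A bounded bijection $U \in \mathcal{L}(H, G)$ is called a unitary map (or unitary operator) if

\[ \forall\, \psi, \varphi \in H : \quad \langle U\psi | U\varphi \rangle_G = \langle \psi | \varphi \rangle_H. \]

If there exists a unitary map $H \to G$, then $H$ and $G$ are said to be unitarily equivalent and we write $H \cong_{\mathrm{Hil}} G$.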


There are a number of equivalent definitions of unitary maps (we will later see one 
involving adjoints) and, in fact, our definition is fairly redundant. 


Proposition 3.16. Let $U \colon H \to G$ be a surjective map which preserves the inner product. Then, $U$ is a unitary map.


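Proof. (i) First, let us check that $U$ is linear. Let $\psi, \varphi \in H$ and $z \in \mathbb{C}$. Then

\[
\begin{aligned}
\|U(z\psi + \varphi) - zU\psi - U\varphi\|_G^2
&= \langle U(z\psi + \varphi) - zU\psi - U\varphi \,|\, U(z\psi + \varphi) - zU\psi - U\varphi \rangle_G \\
&= \langle U(z\psi + \varphi) | U(z\psi + \varphi) \rangle_G + |z|^2 \langle U\psi | U\psi \rangle_G + \langle U\varphi | U\varphi \rangle_G \\
&\quad - z \langle U(z\psi + \varphi) | U\psi \rangle_G - \langle U(z\psi + \varphi) | U\varphi \rangle_G + \bar{z} \langle U\psi | U\varphi \rangle_G \\
&\quad - \bar{z} \langle U\psi | U(z\psi + \varphi) \rangle_G - \langle U\varphi | U(z\psi + \varphi) \rangle_G + z \langle U\varphi | U\psi \rangle_G \\
&= \langle z\psi + \varphi | z\psi + \varphi \rangle_H + |z|^2 \langle \psi | \psi \rangle_H + \langle \varphi | \varphi \rangle_H \\
&\quad - z \langle z\psi + \varphi | \psi \rangle_H - \langle z\psi + \varphi | \varphi \rangle_H + \bar{z} \langle \psi | \varphi \rangle_H \\
&\quad - \bar{z} \langle \psi | z\psi + \varphi \rangle_H - \langle \varphi | z\psi + \varphi \rangle_H + z \langle \varphi | \psi \rangle_H \\
&= 2|z|^2 \langle \psi | \psi \rangle_H + \bar{z} \langle \psi | \varphi \rangle_H + z \langle \varphi | \psi \rangle_H + 2 \langle \varphi | \varphi \rangle_H \\
&\quad - |z|^2 \langle \psi | \psi \rangle_H - z \langle \varphi | \psi \rangle_H - \bar{z} \langle \psi | \varphi \rangle_H - \langle \varphi | \varphi \rangle_H + \bar{z} \langle \psi | \varphi \rangle_H \\
&\quad - |z|^2 \langle \psi | \psi \rangle_H - \bar{z} \langle \psi | \varphi \rangle_H - z \langle \varphi | \psi \rangle_H - \langle \varphi | \varphi \rangle_H + z \langle \varphi | \psi \rangle_H \\
&= 0.
\end{aligned}
\]

Hence $\|U(z\psi + \varphi) - zU\psi - U\varphi\|_G = 0$, and thus

\[ U(z\psi + \varphi) = zU\psi + U\varphi. \]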


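(ii) For boundedness, simply note that since for any $\psi \in H$

\[ \|U\psi\|_G = \sqrt{\langle U\psi | U\psi \rangle_G} = \sqrt{\langle \psi | \psi \rangle_H} = \|\psi\|_H, \]

we have

\[ \sup_{\psi \in H \setminus \{0\}} \frac{\|U\psi\|_G}{\|\psi\|_H} = 1. \]

Hence $U$ is bounded and, in fact, has unit operator norm.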


(iii) Finally, recall that a linear map is injective if, and only if, its kernel is trivial. Suppose that $\psi \in \ker U$. Then, we have

\[ \langle \psi | \psi \rangle_H = \langle U\psi | U\psi \rangle_G = \langle 0 | 0 \rangle_G = 0. \]

Hence, by positive-definiteness, $\psi = 0$ and thus, $U$ is injective. Since $U$ is also surjective by assumption, it satisfies our definition of unitary map. □


Note that a map $U \colon H \to G$ is called an isometry if

\[ \forall\, \psi \in H : \quad \|U\psi\|_G = \|\psi\|_H. \]

Linear isometries are, of course, the structure-preserving maps between normed spaces. We have shown that every unitary map is an isometry and has unit operator norm, hence the name unitary operator.
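
To make the defining property concrete, here is a small numerical sketch with $H = G = \mathbb{C}^2$. The matrix `U` below is one hand-picked unitary matrix and the vectors are arbitrary; all names are illustrative assumptions.

```python
import math

# <a|b> = sum(conj(a_i) * b_i), anti-linear in the first slot.
def inner(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

def apply(matrix, v):
    """Matrix-vector product."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in matrix]

s = 1 / math.sqrt(2)
# Columns (1, i)/sqrt(2) and (1, -i)/sqrt(2) are orthonormal, so U is unitary.
U = [[s, s], [1j * s, -1j * s]]

psi = [1 + 2j, 3 - 1j]
phi = [-2j, 0.5 + 1j]

lhs = inner(apply(U, psi), apply(U, phi))  # <U psi|U phi>
rhs = inner(psi, phi)                      # <psi|phi>

norm_before = abs(inner(psi, psi)) ** 0.5
norm_after = abs(inner(apply(U, psi), apply(U, psi))) ** 0.5
```

Both the inner product and the norm are unchanged by `U`, up to rounding.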


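Example 3.17. Consider the set of all square-summable complex sequences

\[ \ell^2(\mathbb{N}) := \Big\{ a \colon \mathbb{N} \to \mathbb{C} \ \Big|\ \sum_{i=0}^{\infty} |a_i|^2 < \infty \Big\}. \]

We define addition and scalar multiplication of sequences termwise, that is, for all $n \in \mathbb{N}$ and all complex numbers $z \in \mathbb{C}$,

\[ (a + b)_n := a_n + b_n, \qquad (z \cdot a)_n := z a_n. \]
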


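These are, of course, just the discrete analogues of pointwise addition and scalar multiplication of maps. The triangle inequality and homogeneity of the modulus, together with the vector space structure of $\mathbb{C}$, imply that $(\ell^2(\mathbb{N}), +, \cdot)$ is a complex vector space.

The standard inner product on $\ell^2(\mathbb{N})$ is

\[ \langle a | b \rangle_{\ell^2} := \sum_{i=0}^{\infty} \overline{a_i}\, b_i. \]

This inner product induces the norm

\[ \|a\|_{\ell^2} := \sqrt{\langle a | a \rangle_{\ell^2}} = \sqrt{\sum_{i=0}^{\infty} |a_i|^2}, \]

with respect to which $\ell^2(\mathbb{N})$ is complete. Hence, $(\ell^2(\mathbb{N}), +, \cdot, \langle \cdot | \cdot \rangle_{\ell^2})$ is a Hilbert space.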
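Consider the sequence of sequences $\{e_n\}_{n\in\mathbb{N}}$ where

\[
\begin{aligned}
e_0 &= (1, 0, 0, 0, \ldots) \\
e_1 &= (0, 1, 0, 0, \ldots) \\
e_2 &= (0, 0, 1, 0, \ldots) \\
&\ \ \vdots
\end{aligned}
\]

i.e. we have $(e_n)_m = \delta_{nm}$. Each $a \in \ell^2(\mathbb{N})$ can be written uniquely as

\[ a = \sum_{i=0}^{\infty} \lambda^i e_i, \]

where $\lambda^i = \langle e_i | a \rangle_{\ell^2} = a_i$. The sequences $e_n$ are clearly square-summable and, in fact, they are orthonormal with respect to $\langle \cdot | \cdot \rangle_{\ell^2}$:

\[ \langle e_n | e_m \rangle_{\ell^2} = \sum_{i=0}^{\infty} \overline{(e_n)_i}\, (e_m)_i = \sum_{i=0}^{\infty} \delta_{ni} \delta_{mi} = \delta_{nm}. \]

Hence, the sequence $\{e_n\}_{n\in\mathbb{N}}$ is an orthonormal Schauder basis of $\ell^2(\mathbb{N})$, which is therefore an infinite-dimensional separable Hilbert space.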
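
Square-summability can be explored numerically by truncating the series. The sketch below uses the sequence $a_n = 1/(n+1)$, chosen here only as an illustration, whose squared norm is the well-known value $\sum_{k\geq 1} 1/k^2 = \pi^2/6$. The names are hypothetical.

```python
import math

def a(n):
    """The n-th term of a sample square-summable sequence, a_n = 1 / (n + 1)."""
    return 1.0 / (n + 1)

def partial_norm_sq(num_terms):
    """Sum of |a_n|^2 over the first `num_terms` terms."""
    return sum(abs(a(n)) ** 2 for n in range(num_terms))

limit = math.pi ** 2 / 6  # exact value of the full series

cutoffs = (10, 100, 1000)
partials = [partial_norm_sq(N) for N in cutoffs]
# Remaining tail of the series after each cutoff; it is positive and below 1/N.
gaps = [limit - p for p in partials]
```

The partial sums increase towards the limit, which is the behaviour that membership in $\ell^2(\mathbb{N})$ requires.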

Theorem 3.18 (Classification of separable Hilbert spaces). Every infinite-dimensional separable Hilbert space is unitarily equivalent to $\ell^2(\mathbb{N})$.


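Proof. Let $H$ be a separable Hilbert space with basis $\{e_n\}_{n\in\mathbb{N}}$. Consider the map

\[ U \colon H \to \ell^2(\mathbb{N}), \qquad \psi \mapsto \{\langle e_n | \psi \rangle_H\}_{n\in\mathbb{N}}. \]

Note that, for any $\psi \in H$, the sequence $\{\langle e_n | \psi \rangle_H\}_{n\in\mathbb{N}}$ is indeed square-summable since we have

\[ \sum_{i=0}^{\infty} |\langle e_i | \psi \rangle_H|^2 = \|\psi\|_H^2 < \infty. \]
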


By our previous proposition, in order to show that $U$ is a unitary map, it suffices to show that it is surjective and preserves the inner product. For surjectivity, let $\{a_n\}_{n\in\mathbb{N}}$ be a complex square-summable sequence. Since the series $\sum_{i=0}^{\infty} |a_i|^2$ converges, its partial sums form a Cauchy sequence. That is, for any $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that

\[ \forall\, n > m \geq N : \quad \sum_{i=m+1}^{n} |a_i|^2 < \varepsilon. \]

Then, for all $n, m \geq N$ (without loss of generality, assume $n > m$), we have

\[ \Big\| \sum_{i=0}^{n} a_i e_i - \sum_{j=0}^{m} a_j e_j \Big\|_H^2 = \Big\| \sum_{i=m+1}^{n} a_i e_i \Big\|_H^2 = \sum_{i=m+1}^{n} |a_i|^2 \, \|e_i\|_H^2 = \sum_{i=m+1}^{n} |a_i|^2 < \varepsilon. \]

That is, $\big\{ \sum_{i=0}^{n} a_i e_i \big\}_{n\in\mathbb{N}}$ is a Cauchy sequence in $H$. Hence, by completeness, there exists $\psi \in H$ such that

\[ \psi = \sum_{i=0}^{\infty} a_i e_i \]

and we have $U\psi = \{a_n\}_{n\in\mathbb{N}}$, so $U$ is surjective. Moreover, we have


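\[
\begin{aligned}
\langle \psi | \varphi \rangle_H &= \Big\langle \sum_{i=0}^{\infty} \langle e_i | \psi \rangle_H e_i \Big| \sum_{j=0}^{\infty} \langle e_j | \varphi \rangle_H e_j \Big\rangle_H \\
&= \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \overline{\langle e_i | \psi \rangle_H}\, \langle e_j | \varphi \rangle_H \langle e_i | e_j \rangle_H \\
&= \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \overline{\langle e_i | \psi \rangle_H}\, \langle e_j | \varphi \rangle_H \delta_{ij} \\
&= \sum_{i=0}^{\infty} \overline{\langle e_i | \psi \rangle_H}\, \langle e_i | \varphi \rangle_H \\
&= \big\langle \{\langle e_n | \psi \rangle_H\}_{n\in\mathbb{N}} \big| \{\langle e_n | \varphi \rangle_H\}_{n\in\mathbb{N}} \big\rangle_{\ell^2} \\
&=: \langle U\psi | U\varphi \rangle_{\ell^2}.
\end{aligned}
\]

Hence, $U$ preserves the inner product, and it is therefore a unitary map. □
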


4 Projectors, bras and kets 


4.1 Projectors 
Projectors play a key role in quantum theory, as you can see from Axioms 3 and 5. 


Definition. Let $H$ be a separable Hilbert space. Fix a unit vector $e \in H$ (that is, $\|e\| = 1$) and let $\psi \in H$. The projection of $\psi$ to $e$ is

\[ \psi_\parallel := \langle e | \psi \rangle e \]

while the orthogonal complement of $\psi$ is

\[ \psi_\perp := \psi - \psi_\parallel. \]

We can extend these definitions to a countable orthonormal subset $\{e_i\}_{i\in\mathbb{N}} \subseteq H$, i.e. a subset of $H$ whose elements are pairwise orthogonal and have unit norm. Note that $\{e_i\}_{i\in\mathbb{N}}$ need not be a basis of $H$.


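Proposition 4.1. Let $\psi \in H$ and let $\{e_i\}_{i\in\mathbb{N}} \subseteq H$ be an orthonormal subset. Then

(a) we can write $\psi = \psi_\parallel + \psi_\perp$, where

\[ \psi_\parallel := \sum_{i=0}^{\infty} \langle e_i | \psi \rangle e_i, \qquad \psi_\perp := \psi - \psi_\parallel \]

and we have

\[ \forall\, i \in \mathbb{N} : \quad \langle \psi_\perp | e_i \rangle = 0 \]

(b) Pythagoras’ theorem holds:

\[ \|\psi\|^2 = \|\psi_\parallel\|^2 + \|\psi_\perp\|^2. \]

Note that this extends the finite case treated in Lemma 3.14.

(c) for any $\gamma \in \mathrm{span}\{e_i \mid i \in \mathbb{N}\}$, we have the estimate

\[ \|\psi - \gamma\| \geq \|\psi_\perp\| \]

with equality if, and only if, $\gamma = \psi_\parallel$.
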
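Proof. First consider the case of a finite orthonormal subset $\{e_0, \ldots, e_n\} \subset H$.

(a) Let $\psi_\parallel$ and $\psi_\perp$ be defined as in the proposition. Then $\psi_\parallel + \psi_\perp = \psi$ and

\[
\begin{aligned}
\langle \psi_\perp | e_i \rangle &= \Big\langle \psi - \sum_{j=0}^{n} \langle e_j | \psi \rangle e_j \Big| e_i \Big\rangle \\
&= \langle \psi | e_i \rangle - \sum_{j=0}^{n} \overline{\langle e_j | \psi \rangle}\, \langle e_j | e_i \rangle \\
&= \langle \psi | e_i \rangle - \sum_{j=0}^{n} \langle \psi | e_j \rangle \delta_{ji} \\
&= \langle \psi | e_i \rangle - \langle \psi | e_i \rangle \\
&= 0
\end{aligned}
\]

for all $0 \leq i \leq n$.
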


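(b) From part (a), we have

\[ \langle \psi_\perp | \psi_\parallel \rangle = \Big\langle \psi_\perp \Big| \sum_{i=0}^{n} \langle e_i | \psi \rangle e_i \Big\rangle = \sum_{i=0}^{n} \langle e_i | \psi \rangle \langle \psi_\perp | e_i \rangle = 0. \]

Hence, by (the finite-dimensional) Pythagoras’ theorem

\[ \|\psi\|^2 = \|\psi_\parallel + \psi_\perp\|^2 = \|\psi_\parallel\|^2 + \|\psi_\perp\|^2. \]
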
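(c) Let $\gamma \in \mathrm{span}\{e_i \mid 0 \leq i \leq n\}$. Then $\gamma = \sum_{i=0}^{n} \gamma_i e_i$ for some $\gamma_0, \ldots, \gamma_n \in \mathbb{C}$. Hence

\[
\begin{aligned}
\|\psi - \gamma\|^2 &= \|\psi_\perp + \psi_\parallel - \gamma\|^2 \\
&= \Big\| \psi_\perp + \sum_{i=0}^{n} \langle e_i | \psi \rangle e_i - \sum_{i=0}^{n} \gamma_i e_i \Big\|^2 \\
&= \Big\| \psi_\perp + \sum_{i=0}^{n} \big( \langle e_i | \psi \rangle - \gamma_i \big) e_i \Big\|^2 \\
&= \|\psi_\perp\|^2 + \sum_{i=0}^{n} |\langle e_i | \psi \rangle - \gamma_i|^2
\end{aligned}
\]

and thus $\|\psi - \gamma\| \geq \|\psi_\perp\|$, since $|\langle e_i | \psi \rangle - \gamma_i|^2 \geq 0$ for all $0 \leq i \leq n$. Moreover, we have equality if, and only if, $|\langle e_i | \psi \rangle - \gamma_i| = 0$ for all $0 \leq i \leq n$, that is, $\gamma = \psi_\parallel$.
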
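To extend this to a countably infinite orthonormal set $\{e_i\}_{i\in\mathbb{N}}$, note that by part (b) and Bessel’s inequality, we have

\[ \Big\| \sum_{i=0}^{n} \langle e_i | \psi \rangle e_i \Big\|^2 = \sum_{i=0}^{n} |\langle e_i | \psi \rangle|^2 \leq \|\psi\|^2. \]

Since $|\langle e_i | \psi \rangle|^2 \geq 0$, the sequence of partial sums $\big\{ \sum_{i=0}^{n} |\langle e_i | \psi \rangle|^2 \big\}_{n\in\mathbb{N}}$ is monotonically increasing and bounded from above by $\|\psi\|^2$. Hence, it converges and this implies that

\[ \psi_\parallel := \sum_{i=0}^{\infty} \langle e_i | \psi \rangle e_i \]

exists as an element of $H$. The extension to the countably infinite case then follows by continuity of the inner product. □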
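
The three statements of the proposition can be checked numerically for a finite orthonormal set. The following sketch works in $\mathbb{C}^3$ with two orthonormal vectors that do not form a basis; the vectors and helper names are assumptions made for illustration.

```python
import math

# <a|b> = sum(conj(a_i) * b_i), anti-linear in the first slot.
def inner(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

def norm_sq(a):
    return inner(a, a).real

def project(psi, orthonormal_set):
    """Return (psi_par, psi_perp) with psi_par = sum_i <e_i|psi> e_i."""
    psi_par = [0j] * len(psi)
    for e in orthonormal_set:
        c = inner(e, psi)
        psi_par = [p + c * ek for p, ek in zip(psi_par, e)]
    psi_perp = [a - b for a, b in zip(psi, psi_par)]
    return psi_par, psi_perp

s = 1 / math.sqrt(2)
e0 = [s + 0j, 1j * s, 0j]
e1 = [0j, 0j, 1 + 0j]
psi = [1 + 1j, 2 - 1j, 3j]

psi_par, psi_perp = project(psi, [e0, e1])

# Some other element of span{e0, e1}, for the estimate in part (c).
gamma = [2 * a + 1j * b for a, b in zip(e0, e1)]
distance_sq = norm_sq([p - g for p, g in zip(psi, gamma)])
```

Here `psi_perp` is orthogonal to both `e0` and `e1`, the squared norms add up as in part (b), and `distance_sq` is no smaller than `norm_sq(psi_perp)`.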


4.2 Closed linear subspaces 


We will often be interested in looking at linear subspaces of a Hilbert space $H$, i.e. subsets $M \subseteq H$ such that

\[ \forall\, \psi, \varphi \in M : \forall\, z \in \mathbb{C} : \quad z\psi + \varphi \in M. \]


Note that while every linear subspace M C H inherits the inner product on H to become 
an inner product space, it may fail to be complete with respect to this inner product. In 
other words, not every linear subspace of a Hilbert space is necessarily a sub-Hilbert space. 

The following definitions are with respect to the norm topology on a normed space and 
can, of course, be given more generally on an arbitrary topological space. 




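Definition. Let $H$ be a normed space. A subset $M \subseteq H$ is said to be open if

\[ \forall\, \psi \in M : \exists\, r > 0 : \forall\, \varphi \in H : \quad \|\psi - \varphi\| < r \ \Rightarrow\ \varphi \in M. \]

Equivalently, by defining the open ball of radius $r > 0$ and centre $\psi \in H$

\[ B_r(\psi) := \{ \varphi \in H \mid \|\psi - \varphi\| < r \}, \]

we can define $M \subseteq H$ to be open if

\[ \forall\, \psi \in M : \exists\, r > 0 : \quad B_r(\psi) \subseteq M. \]

Definition. A subset $M \subseteq H$ is said to be closed if its complement $H \setminus M$ is open.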


Proposition 4.2. A closed subset M of a complete normed space H is complete. 


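Proof. Let $\{\psi_n\}_{n\in\mathbb{N}}$ be a Cauchy sequence in the closed subset $M$. Then, $\{\psi_n\}_{n\in\mathbb{N}}$ is also a Cauchy sequence in $H$, and hence it converges to some $\psi \in H$ since $H$ is complete. We want to show that, in fact, $\psi \in M$. Suppose, for the sake of contradiction, that $\psi \notin M$, i.e. $\psi \in H \setminus M$. Since $M$ is closed, $H \setminus M$ is open. Hence, there exists $r > 0$ such that

\[ \forall\, \varphi \in H : \quad \|\varphi - \psi\| < r \ \Rightarrow\ \varphi \in H \setminus M. \]

However, since $\psi$ is the limit of $\{\psi_n\}_{n\in\mathbb{N}}$, there exists $N \in \mathbb{N}$ such that

\[ \forall\, n \geq N : \quad \|\psi_n - \psi\| < r. \]

Hence, for all $n \geq N$, we have $\psi_n \in H \setminus M$, i.e. $\psi_n \notin M$, contradicting the fact that $\{\psi_n\}_{n\in\mathbb{N}}$ is a sequence in $M$. Thus, we must have $\psi \in M$. □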


Corollary 4.3. A closed linear subspace M of a Hilbert space H is a sub-Hilbert space with 
the inner product on H. Moreover, if H is separable, then so is M. 


Knowing that a linear subspace of a Hilbert space is, in fact, a sub-Hilbert space can 
be very useful. For instance, we know that there exists an orthonormal basis for the linear 


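subspace. Note that the converse to the corollary also holds: a linear subspace $M$ which is complete with respect to the inner product inherited from $H$ is necessarily closed in $H$, since a sequence in $M$ that converges in $H$ is a Cauchy sequence in $M$ and hence has its limit in $M$.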


4.3 Orthogonal projections 

Definition. Let $M \subseteq H$ be a (not necessarily closed) linear subspace of $H$. The set

\[ M^\perp := \{ \psi \in H \mid \forall\, \varphi \in M : \langle \varphi | \psi \rangle = 0 \} \]

is called the orthogonal complement of $M$ in $H$.


Proposition 4.4. Let $M \subseteq H$ be a linear subspace of $H$. Then, $M^\perp$ is a closed linear subspace of $H$.



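Proof. Let $\psi_1, \psi_2 \in M^\perp$ and $z \in \mathbb{C}$. Then, for all $\varphi \in M$

\[ \langle \varphi | z\psi_1 + \psi_2 \rangle = z \langle \varphi | \psi_1 \rangle + \langle \varphi | \psi_2 \rangle = 0 \]

and hence $z\psi_1 + \psi_2 \in M^\perp$. Thus, $M^\perp$ is a linear subspace of $H$. It remains to be shown that it is also closed. Define the maps

\[ f_\varphi \colon H \to \mathbb{C}, \qquad \psi \mapsto \langle \varphi | \psi \rangle. \]

Then, we can write

\[ M^\perp = \bigcap_{\varphi \in M} \mathrm{preim}_{f_\varphi}(\{0\}). \]

Since the inner product is continuous (in each slot), the maps $f_\varphi$ are continuous. Hence, the pre-images of closed sets are closed. As the singleton $\{0\}$ is closed in the standard topology on $\mathbb{C}$, the sets $\mathrm{preim}_{f_\varphi}(\{0\})$ are closed for all $\varphi \in M$. Thus, $M^\perp$ is closed since arbitrary intersections of closed sets are closed. □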


Remark 4.5. Note that closed is not the same as not open, and the argument above cannot be adapted to show that $M^\perp$ is open. The singleton $\{0\}$ is not open in the standard topology on $\mathbb{C}$ and, even for pre-images of open sets, only finite intersections of open sets are guaranteed to be open. So the intersection over all of $M$ plays an important role.


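Note that by Pythagoras’ theorem, we have the decomposition

\[ H = M \oplus M^\perp := \{ \psi + \varphi \mid \psi \in M,\ \varphi \in M^\perp \} \]

for any closed linear subspace $M$.

Proposition 4.6. For any closed linear subspace $X \subseteq H$ it is true that $X^{\perp\perp} = X$.


Proof. Let $x \in X$. Then, for all $y \in X^\perp$ we have $\langle x | y \rangle = 0$ and so $x \in X^{\perp\perp}$. This gives $X \subseteq X^{\perp\perp}$.

Now consider $z \in X^{\perp\perp}$. As $X$ is closed, from the above note we know it can be decomposed as $z = x + y$ for $x \in X$ and $y \in X^\perp$. We then have

\[
\begin{aligned}
0 &= \langle y | z \rangle \\
&= \langle y | x + y \rangle \\
&= \langle y | x \rangle + \langle y | y \rangle \\
&= \|y\|^2 \\
\Rightarrow\ y &= 0,
\end{aligned}
\]

where the last step comes from the definiteness of $\|\cdot\|$. So we have $z = x \in X$ and $X^{\perp\perp} \subseteq X$. □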


Proposition 4.7. For any linear subspace $M \subseteq H$ it is true that $M^{\perp\perp} = \overline{M}$, where the latter is the topological closure of the set.


Proof. We start with two observations:

(i) $M \subseteq M^{\perp\perp}$, which was shown at the start of the last proof (as there was no use of the fact that $X$ was closed there).

(ii) If $M_1 \subseteq M_2$ then $M_2^\perp \subseteq M_1^\perp$, which can be shown easily.

First let’s show that $M^{\perp\perp} \subseteq \overline{M}$. Clearly $M \subseteq \overline{M}$ (where the equality holds only if $M$ is closed), and so from (ii) we have $\overline{M}^\perp \subseteq M^\perp$, which in turn gives $M^{\perp\perp} \subseteq \overline{M}^{\perp\perp}$. But $\overline{M}$ is closed and so from the previous proposition we have $M^{\perp\perp} \subseteq \overline{M}$.

Now we need to show the reverse inclusion. From Proposition 4.4 we know that $M^{\perp\perp}$ is closed. Then (i) instantly tells us that $\overline{M} \subseteq M^{\perp\perp}$. □


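Definition. Let $M$ be a closed linear subspace of a separable Hilbert space $H$ and fix some orthonormal basis of $M$. The map

\[ P_M \colon H \to M, \qquad \psi \mapsto \psi_\parallel \]

is called the orthogonal projector to $M$.

Proposition 4.8. Let $P_M \colon H \to M$ be an orthogonal projector to $M \subseteq H$. Then

(i) $P_M \circ P_M = P_M$, sometimes also written as $P_M^2 = P_M$

(ii) $\forall\, \psi, \varphi \in H : \langle P_M \psi | \varphi \rangle = \langle \psi | P_M \varphi \rangle$

(iii) $P_{M^\perp} \psi = \psi_\perp$

(iv) $P_M \in \mathcal{L}(H, M)$.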


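Proof. Let $\{e_i\}_{i\in I}$ and $\{e_i\}_{i\in J}$ be bases of $M$ and $M^\perp$ respectively, where $I$, $J$ are disjoint and either finite or countably infinite, such that $\{e_i\}_{i\in I\cup J}$ is a basis of $H$ (note that we should think of $I \cup J$ as having a definite ordering).

(i) Let $\psi \in H$. Then

\[
\begin{aligned}
P_M(P_M \psi) &= P_M \Big( \sum_{i\in I} \langle e_i | \psi \rangle e_i \Big) \\
&= \sum_{j\in I} \Big\langle e_j \Big| \sum_{i\in I} \langle e_i | \psi \rangle e_i \Big\rangle e_j \\
&= \sum_{j\in I} \sum_{i\in I} \langle e_i | \psi \rangle \langle e_j | e_i \rangle e_j \\
&= \sum_{i\in I} \langle e_i | \psi \rangle e_i \\
&=: P_M \psi.
\end{aligned}
\]

(ii) Let $\psi, \varphi \in H$. Then

\[
\begin{aligned}
\langle P_M \psi | \varphi \rangle &= \Big\langle \sum_{i\in I} \langle e_i | \psi \rangle e_i \Big| \varphi \Big\rangle \\
&= \sum_{i\in I} \overline{\langle e_i | \psi \rangle}\, \langle e_i | \varphi \rangle \\
&= \sum_{i\in I} \langle e_i | \varphi \rangle \langle \psi | e_i \rangle \\
&= \Big\langle \psi \Big| \sum_{i\in I} \langle e_i | \varphi \rangle e_i \Big\rangle \\
&=: \langle \psi | P_M \varphi \rangle.
\end{aligned}
\]

(iii) Let $\psi \in H$. Then

\[ P_M \psi + P_{M^\perp} \psi = \sum_{i\in I} \langle e_i | \psi \rangle e_i + \sum_{i\in J} \langle e_i | \psi \rangle e_i = \sum_{i\in I\cup J} \langle e_i | \psi \rangle e_i = \psi. \]

Hence

\[ P_{M^\perp} \psi = \psi - P_M \psi = \psi - \psi_\parallel = \psi_\perp. \]

(iv) Let $\psi \in H$. Then, by Pythagoras’ theorem,

\[ \sup_{\psi \in H} \frac{\|P_M \psi\|}{\|\psi\|} = \sup_{\psi \in H} \frac{\|\psi_\parallel\|}{\|\psi\|} \leq \sup_{\psi \in H} \frac{\|\psi\|}{\|\psi\|} = 1 < \infty. \]
□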


Quite interesting, and heavily used, is the converse.

Theorem 4.9. Let $P \in \mathcal{L}(H, H)$ have the properties

(i) $P \circ P = P$

(ii) $\forall\, \psi, \varphi \in H : \langle P\psi | \varphi \rangle = \langle \psi | P\varphi \rangle$.

Then, the range $P(H)$ of $P$ is closed and

\[ P = P_{P(H)}. \]

In other words, every projector is the orthogonal projector to some closed linear subspace.
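
In finite dimensions, the orthogonal projector to $M = \mathrm{span}\{e_i\}$ is represented by the matrix with entries $P_{jk} = \sum_i (e_i)_j \overline{(e_i)_k}$, and the two properties in the theorem become idempotence and Hermiticity of that matrix. A minimal sketch in $\mathbb{C}^3$ follows; the basis of $M$ and the helper names are illustrative assumptions.

```python
import math

def projector(vectors):
    """Matrix of the map psi -> sum_i <e_i|psi> e_i for orthonormal `vectors`."""
    n = len(vectors[0])
    return [[sum(e[j] * e[k].conjugate() for e in vectors) for k in range(n)]
            for j in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

s = 1 / math.sqrt(2)
basis_of_M = [[s + 0j, 1j * s, 0j], [0j, 0j, 1 + 0j]]  # orthonormal, spans M

P = projector(basis_of_M)
identity = [[1 + 0j if i == j else 0j for j in range(3)] for i in range(3)]
Q = [[identity[i][j] - P[i][j] for j in range(3)] for i in range(3)]  # projector to M-perp
zero = [[0j] * 3 for _ in range(3)]
```

One finds that `P` squares to itself and equals its conjugate transpose, while `Q = id - P` satisfies `P Q = 0`, in line with property (iii) of the preceding proposition.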


4.4 Riesz representation theorem, bras and kets 


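Let $H$ be a Hilbert space. Consider again the map

\[ f_\varphi \colon H \to \mathbb{C}, \qquad \psi \mapsto \langle \varphi | \psi \rangle \]

for $\varphi \in H$. The linearity in the second argument of the inner product implies that this map is linear. Moreover, by the Cauchy-Schwarz inequality, we have

\[ \sup_{\psi \in H} \frac{|f_\varphi(\psi)|}{\|\psi\|} = \sup_{\psi \in H} \frac{|\langle \varphi | \psi \rangle|}{\|\psi\|} \leq \sup_{\psi \in H} \frac{\|\varphi\| \, \|\psi\|}{\|\psi\|} = \|\varphi\| < \infty. \]

Hence, $f_\varphi \in \mathcal{L}(H, \mathbb{C}) =: H^*$. Therefore, to every element $\varphi$ of $H$, there is associated an element $f_\varphi$ in the dual space $H^*$. In fact, the converse is also true.
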


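Theorem 4.10 (Riesz representation). Every $f \in H^*$ is of the form $f_\varphi$ for a unique $\varphi \in H$.


Proof. First, suppose that $f = 0$, i.e. $f$ is the zero functional on $H$. Then, clearly, $f = f_0$ with $0 \in H$. Hence, suppose that $f \neq 0$. Since $\ker f := \mathrm{preim}_f(\{0\})$ is a closed linear subspace, we can write

\[ H = \ker f \oplus (\ker f)^\perp. \]

As $f \neq 0$, there exists some $\psi \in H$ such that $\psi \notin \ker f$. Hence, $\ker f \neq H$, and thus $(\ker f)^\perp \neq \{0\}$. Let $\xi \in (\ker f)^\perp \setminus \{0\}$ and assume, w.l.o.g., that $\|\xi\| = 1$. Define

\[ \varphi := \overline{f(\xi)}\, \xi \in (\ker f)^\perp. \]

Then, for any $\psi \in H$, we have

\[
\begin{aligned}
f_\varphi(\psi) - f(\psi) &= \langle \overline{f(\xi)}\, \xi | \psi \rangle - f(\psi) \langle \xi | \xi \rangle \\
&= \langle \xi | f(\xi) \psi \rangle - \langle \xi | f(\psi) \xi \rangle \\
&= \langle \xi | f(\xi) \psi - f(\psi) \xi \rangle.
\end{aligned}
\]

Note that

\[ f\big( f(\xi) \psi - f(\psi) \xi \big) = f(\xi) f(\psi) - f(\psi) f(\xi) = 0, \]

that is, $f(\xi) \psi - f(\psi) \xi \in \ker f$. Since $\xi \in (\ker f)^\perp$, we have

\[ \langle \xi | f(\xi) \psi - f(\psi) \xi \rangle = 0 \]

and hence $f_\varphi(\psi) = f(\psi)$ for all $\psi \in H$, i.e. $f = f_\varphi$. For uniqueness, suppose that

\[ f = f_{\varphi_1} = f_{\varphi_2} \]

for some $\varphi_1, \varphi_2 \in H$. Then, for any $\psi \in H$,

\[
\begin{aligned}
0 &= f_{\varphi_1}(\psi) - f_{\varphi_2}(\psi) \\
&= \langle \varphi_1 | \psi \rangle - \langle \varphi_2 | \psi \rangle \\
&= \langle \varphi_1 - \varphi_2 | \psi \rangle
\end{aligned}
\]

and hence, $\varphi_1 = \varphi_2$ by positive-definiteness. □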
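
In $\mathbb{C}^n$ the representing vector can be written down directly: with the standard basis $\{e_i\}$, the vector $\varphi$ with components $\varphi_i = \overline{f(e_i)}$ satisfies $f = \langle \varphi | \cdot \rangle$. The sketch below checks this for one sample functional; the functional `f` and all names are hypothetical choices made for illustration.

```python
# <a|b> = sum(conj(a_i) * b_i), anti-linear in the first slot.
def inner(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

def f(v):
    """A sample linear functional on C^3 (the coefficients are arbitrary)."""
    return (2 - 1j) * v[0] + 3j * v[1] + v[2]

n = 3
standard_basis = [[1 + 0j if i == k else 0j for k in range(n)] for i in range(n)]

# Representing vector: phi_i = conj(f(e_i)).
phi = [f(e).conjugate() for e in standard_basis]

psi = [1 + 1j, -2 + 0j, 0.5j]  # arbitrary test vector
via_inner_product = inner(phi, psi)
via_functional = f(psi)
```

The complex conjugate in the definition of `phi` mirrors the factor $\overline{f(\xi)}$ in the proof, and is needed because the inner product is anti-linear in its first slot.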
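Therefore, the so-called Riesz map

\[ R \colon H \to H^*, \qquad \varphi \mapsto f_\varphi \equiv \langle \varphi | \,\cdot\, \rangle \]

is a bijection. Since the inner product is anti-linear in its first argument, $R$ is anti-linear, i.e. additive with $R(z\varphi) = \bar{z} R(\varphi)$, and $H$ and $H^*$ can be identified with one another up to complex conjugation of scalars. This led Dirac to suggest the following notation for the elements of the dual space

\[ f_\varphi \equiv \langle \varphi |. \]

Correspondingly, he wrote $|\psi\rangle$ for the element $\psi \in H$. Since $\langle \cdot | \cdot \rangle$ is “a bracket”, the first half $\langle \cdot |$ is called a bra, while the second half $| \cdot \rangle$ is called a ket (nobody knows where the missing c is). With this notation, we have

\[ f_\varphi(\psi) \equiv \langle \varphi | \big( |\psi\rangle \big) \equiv \langle \varphi | \psi \rangle. \]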




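The advantage of this notation is that some formulae become more intuitive and hence are more easily memorised. For a concrete example, consider

\[ \psi = \sum_{i=0}^{\infty} \langle e_i | \psi \rangle e_i \]

where $\{e_n\}_{n\in\mathbb{N}}$ is a basis of $H$. This becomes

\[ |\psi\rangle = \sum_{i=0}^{\infty} \langle e_i | \psi \rangle\, |e_i\rangle. \]

By allowing the scalar multiplication of kets also from the right, defined to yield the same result as that on the left, we have

\[ |\psi\rangle = \sum_{i=0}^{\infty} |e_i\rangle \langle e_i | \psi \rangle. \]

“Quite obviously”, we can bracket this as

\[ |\psi\rangle = \Big( \sum_{i=0}^{\infty} |e_i\rangle \langle e_i| \Big) |\psi\rangle, \]

where by “quite obviously”, we mean that we have a suppressed tensor product (see section 8 of the Lectures on the Geometric Anatomy of Theoretical Physics for more details on tensors)

\[ |\psi\rangle = \Big( \sum_{i=0}^{\infty} |e_i\rangle \otimes \langle e_i| \Big) |\psi\rangle. \]

Then, the sum in the round brackets is an element of $H \otimes H^*$. While $H \otimes H^*$ is isomorphic to $\mathrm{End}(H)$, its elements are maps $H^* \times H \to \mathbb{C}$. Hence, one needs to either make this isomorphism explicit or, equivalently,

\[ |\psi\rangle = \Big( \sum_{i=0}^{\infty} |e_i\rangle \otimes \langle e_i| \Big) (\,\cdot\,, |\psi\rangle). \]

All of this to be able to write

\[ \sum_{i=0}^{\infty} |e_i\rangle \langle e_i| = \mathrm{id}_H \]

and hence interpret the expansion of $|\psi\rangle$ in terms of the basis as the “insertion” of an identity

\[ |\psi\rangle = \mathrm{id}_H |\psi\rangle = \Big( \sum_{i=0}^{\infty} |e_i\rangle \langle e_i| \Big) |\psi\rangle = \sum_{i=0}^{\infty} \langle e_i | \psi \rangle\, |e_i\rangle. \]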
1=0 4=0 


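But the original expression was already clear in the first place, without the need to add any of this extra structure. Part of the appeal of this notation is that one can intuitively think of $|e_i\rangle \langle e_i|$ as a map $H \to H$ by imagining that the bra on the right acts on a ket in $H$, thereby producing a complex number which becomes the coefficient of the remaining ket

\[ \big( |e_i\rangle \langle e_i| \big) |\psi\rangle = |e_i\rangle \langle e_i | \psi \rangle = \langle e_i | \psi \rangle\, |e_i\rangle. \]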


The major drawback of this notation, and the reason why we will not adopt it, is that in many places (for instance, if we consider self-adjoint operators, or Hermitian operators) this notation is free of inconsistencies only if certain conditions are satisfied. While these conditions will indeed be satisfied most of the time, it becomes extremely confusing to formulate conditions on our objects by using a notation that only makes sense if the objects already satisfy conditions.


Bra-ket notation is just one of many examples of notation that is widely used in physics and related applied sciences but does not necessarily make things clearer. If anything, it makes things more complicated.




5 Measure theory 


This and the next section will be a short recap of basic notions from measure theory and 
Lebesgue integration. These are inescapable subjects if one wants to understand quantum 
mechanics since 


(i) the spectral theorem requires the notion of (projection-valued) measures 


(ii) the most commonly emerging separable Hilbert space in quantum mechanics is the
space $L^2(\mathbb{R}^d)$, whose definition needs the notion of Lebesgue integral.


5.1 General measure spaces and basic results 


Definition. Let $M$ be a non-empty set. A collection $\Sigma \subseteq \mathcal{P}(M)$ of subsets of $M$ is called a $\sigma$-algebra for $M$ if the following conditions are satisfied

(i) $M \in \Sigma$

(ii) if $A \in \Sigma$, then $M \setminus A \in \Sigma$

(iii) for any sequence $\{A_n\}_{n\in\mathbb{N}}$ in $\Sigma$ we have $\bigcup_{n=0}^{\infty} A_n \in \Sigma$.


Remark 5.1. If we relax the third condition so that it applies only to finite (rather than countable) unions, we obtain the notion of an algebra, often called an algebra of sets in order to distinguish it from the notion of algebra as a vector space equipped with a bilinear product, with which it has nothing to do.


Remark 5.2. Note that by condition (ii) and De Morgan’s laws, condition (iii) can be equivalently stated in terms of intersections rather than unions. Recall that De Morgan’s laws “interchange” unions with intersections and vice-versa under the complement operation. That is, if $M$ is a set and $\{A_i\}_{i\in I}$ is a collection of sets, then

\[ M \setminus \Big( \bigcup_{i\in I} A_i \Big) = \bigcap_{i\in I} (M \setminus A_i), \qquad M \setminus \Big( \bigcap_{i\in I} A_i \Big) = \bigcup_{i\in I} (M \setminus A_i). \]

A $\sigma$-algebra is closed under countably infinite unions (by definition) but also under countably infinite intersections and finite unions and intersections.


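Proposition 5.3. Let $M$ be a set and let $\Sigma$ be a $\sigma$-algebra on $M$. Let $\{A_n\}_{n\in\mathbb{N}}$ be a sequence in $\Sigma$. Then, for all $k \in \mathbb{N}$, we have

(i) $\bigcup_{n=0}^{k} A_n \in \Sigma$

(ii) $\bigcap_{n=0}^{\infty} A_n \in \Sigma$ and $\bigcap_{n=0}^{k} A_n \in \Sigma$.

Proof. (i) Let the sequence $\{B_n\}_{n\in\mathbb{N}}$ be defined as follows:

\[ B_n := \begin{cases} A_n & \text{if } 0 \leq n \leq k \\ \emptyset & \text{if } n > k. \end{cases} \]

Then, $\{B_n\}_{n\in\mathbb{N}}$ is a sequence in $\Sigma$ (note that $\emptyset = M \setminus M \in \Sigma$), so $\bigcup_{n=0}^{\infty} B_n \in \Sigma$. Hence, we have:

\[ \bigcup_{n=0}^{\infty} B_n = \Big( \bigcup_{n=0}^{k} B_n \Big) \cup \Big( \bigcup_{n=k+1}^{\infty} B_n \Big) = \bigcup_{n=0}^{k} A_n \]

and thus $\bigcup_{n=0}^{k} A_n \in \Sigma$.

(ii) As $\{A_n\}_{n\in\mathbb{N}}$ is a sequence in $\Sigma$, so is $\{M \setminus A_n\}_{n\in\mathbb{N}}$ and hence $\bigcup_{n=0}^{\infty} (M \setminus A_n) \in \Sigma$. Thus, we also have

\[ M \setminus \Big( \bigcup_{n=0}^{\infty} (M \setminus A_n) \Big) \in \Sigma \]

and since $M \setminus (M \setminus A_n) = A_n$, by De Morgan’s laws, $\bigcap_{n=0}^{\infty} A_n \in \Sigma$. That this holds for finite intersections follows in the same way from part (i). □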


Definition. A measurable space is a pair $(M, \Sigma)$ where $M$ is a set and $\Sigma$ is a $\sigma$-algebra on $M$. The elements of $\Sigma$ are called measurable subsets of $M$.


Our goal is to assign volumes (i.e. measures) to subsets of a given set. Of course, we would also like this assignment to satisfy some sensible conditions. However, it turns out that one cannot sensibly assign volumes to any arbitrary collection of subsets of a given set*. It is necessary that the collection of subsets be a $\sigma$-algebra. In addition, just like in topology openness and closedness are not properties of subsets but properties of subsets with respect to a choice of topology, so does measurability of subsets only make sense with respect to a choice of $\sigma$-algebra. In particular, a given subset could be measurable with respect to some $\sigma$-algebra and not measurable with respect to some other $\sigma$-algebra.


Example 5.4. The pair $(M, \mathcal{P}(M))$ is a measurable space for any set $M$. Of course, just like the discrete topology is not a very useful topology, the power set $\mathcal{P}(M)$ is not a very useful $\sigma$-algebra on $M$, unless $M$ is countable.


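Definition. The extended real line is $\overline{\mathbb{R}} := \mathbb{R} \cup \{-\infty, +\infty\}$, where the symbols $-\infty$ and $+\infty$ (the latter often denoted simply by $\infty$) satisfy

\[ \forall\, r \in \overline{\mathbb{R}} : \quad -\infty \leq r \leq \infty \]

with strict inequalities if $r \in \mathbb{R}$. The symbols $\pm\infty$ satisfy the following arithmetic rules

(i) $\forall\, r \in \mathbb{R} : \ \pm\infty + r = r \pm \infty = \pm\infty$

(ii) $\forall\, r > 0 : \ r(\pm\infty) = \pm\infty\, r = \pm\infty$

(iii) $\forall\, r < 0 : \ r(\pm\infty) = \pm\infty\, r = \mp\infty$

(iv) $0(\pm\infty) = \pm\infty\, 0 = 0$.

Note that expressions such as $\infty - \infty$ or $-\infty + \infty$ are not defined.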
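
When experimenting with these rules on a computer, note that IEEE floating-point arithmetic, as exposed by Python’s `math.inf`, agrees with rules (i) to (iii) but not with rule (iv): the product `0.0 * math.inf` evaluates to NaN rather than 0. The helper `ext_mul` below is a hypothetical name for a small wrapper that implements the measure-theoretic convention.

```python
import math

def ext_mul(x, y):
    """Product on the extended real line with the convention 0 * (+-inf) = 0."""
    if x == 0 or y == 0:
        return 0.0
    return x * y

# Rules (i)-(iii) already hold for floats.
sum_with_inf = math.inf + 5       # +inf
pos_times_inf = 3 * -math.inf     # -inf
neg_times_inf = -2 * math.inf     # -inf

# Rule (iv) needs the wrapper: plain float multiplication gives NaN.
raw_product = 0.0 * math.inf
fixed_product = ext_mul(0.0, math.inf)

# inf - inf is undefined; floats signal this with NaN.
undefined = math.inf - math.inf
```

The convention $0 \cdot \infty = 0$ is the one that makes statements such as “a set of measure zero contributes nothing to an integral” work without case distinctions.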


* https://en.wikipedia.org/wiki/Non-measurable_set




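Definition. Let $(M, \Sigma)$ be a measurable space. A measure on $(M, \Sigma)$ is a function

\[ \mu \colon \Sigma \to [0, \infty], \]

where $[0, \infty] := \{ r \in \overline{\mathbb{R}} \mid r \geq 0 \}$, such that

(i) $\mu(\emptyset) = 0$

(ii) for any sequence $\{A_n\}_{n\in\mathbb{N}}$ in $\Sigma$ with $A_i \cap A_j = \emptyset$ whenever $i \neq j$, we have

\[ \mu\Big( \bigcup_{n=0}^{\infty} A_n \Big) = \sum_{n=0}^{\infty} \mu(A_n). \]

A sequence $\{A_n\}_{n\in\mathbb{N}}$ that satisfies the condition that $A_i \cap A_j = \emptyset$ for all $i \neq j$ is called a pairwise disjoint sequence.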


Remark 5.5. Both sides of the equation in part (ii) of the definition of measure might take the value $\infty$. There are two possible reasons why $\sum_{n=0}^{\infty} \mu(A_n)$ might be infinite. It could be that $\mu(A_n) = \infty$ for some $n \in \mathbb{N}$ or, alternatively, it could be that $\mu(A_n) < \infty$ for all $n \in \mathbb{N}$ but the sequence of partial sums $\big\{ \sum_{i=0}^{n} \mu(A_i) \big\}_{n\in\mathbb{N}}$, which is an increasing sequence since $\mu$ is non-negative by definition, is not bounded above.


Definition. A measure space is a triple $(M, \Sigma, \mu)$ where $(M, \Sigma)$ is a measurable space and $\mu \colon \Sigma \to [0, \infty]$ is a measure on $(M, \Sigma)$.


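Example 5.6. Let $M = \mathbb{N}$ and $\Sigma = \mathcal{P}(\mathbb{N})$. Define $\mu \colon \Sigma \to [0, \infty]$ by:

\[ \mu(A) = |A| := \begin{cases} n & \text{if $A$ is a finite set with $n$ elements} \\ \infty & \text{if $A$ is not a finite set.} \end{cases} \]

Then, by definition, $\mu(\emptyset) = 0$. Moreover, if $\{A_n\}_{n\in\mathbb{N}}$ is a pairwise disjoint sequence in $\Sigma$ such that all but a finite number of $A_n$’s are empty and each $A_n$ has a finite number of elements, we have:

\[ \mu\Big( \bigcup_{n=0}^{\infty} A_n \Big) = \sum_{n=0}^{\infty} \mu(A_n) \]

by counting elements. Otherwise, that is, if an infinite number of $A_n$’s are non-empty or if at least one $A_n$ is infinite, then

\[ \mu\Big( \bigcup_{n=0}^{\infty} A_n \Big) = \infty = \sum_{n=0}^{\infty} \mu(A_n) \]

and thus, the triple $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ is a measure space. The measure $\mu$ on $(\mathbb{N}, \mathcal{P}(\mathbb{N}))$ is called counting measure and it is the usual measure on countable measurable spaces.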
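
Restricted to finite subsets of $\mathbb{N}$, the counting measure is simply the number of elements, which makes it easy to experiment with. The sketch below checks additivity on a small pairwise disjoint family; the family and the helper names are illustrative assumptions, and infinite sets are deliberately left out.

```python
def counting_measure(A):
    """Counting measure of a finite set: its number of elements."""
    return len(A)

def union(sets):
    out = frozenset()
    for A in sets:
        out |= A
    return out

def pairwise_disjoint(sets):
    return all(A.isdisjoint(B)
               for i, A in enumerate(sets)
               for B in sets[i + 1:])

# A pairwise disjoint family; empty members are allowed and contribute 0.
family = [frozenset({0, 1}), frozenset({5}), frozenset(), frozenset({7, 8, 9})]

measure_of_union = counting_measure(union(family))
sum_of_measures = sum(counting_measure(A) for A in family)
```

For a family that is not pairwise disjoint the two numbers generally differ, since shared elements are counted once on the left and several times on the right.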




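Proposition 5.7. Let $(M, \Sigma, \mu)$ be a measure space.

(i) If $A_0, \ldots, A_k \in \Sigma$ and $A_i \cap A_j = \emptyset$ for all $0 \leq i \neq j \leq k$, then:

\[ \mu\Big( \bigcup_{n=0}^{k} A_n \Big) = \sum_{n=0}^{k} \mu(A_n) \]

(ii) If $A, B \in \Sigma$ and $A \subseteq B$, then $\mu(A) \leq \mu(B)$

(iii) If $A, B \in \Sigma$, $A \subseteq B$ and $\mu(A) < \infty$, then $\mu(B \setminus A) = \mu(B) - \mu(A)$.


Proof. (i) Let $A_n := \emptyset$ for all $n > k$. Then, $\{A_n\}_{n\in\mathbb{N}}$ is a pairwise disjoint sequence in $\Sigma$ and hence, we have:

\[ \mu\Big( \bigcup_{n=0}^{k} A_n \Big) = \mu\Big( \bigcup_{n=0}^{\infty} A_n \Big) = \sum_{n=0}^{\infty} \mu(A_n) = \sum_{n=0}^{k} \mu(A_n). \]

(ii) We have $B = A \cup (B \setminus A)$ and $A \cap (B \setminus A) = \emptyset$. Hence, by part (i),

\[ \mu(B) = \mu(A \cup (B \setminus A)) = \mu(A) + \mu(B \setminus A), \]

and since $\mu(B \setminus A) \geq 0$, we have $\mu(A) \leq \mu(B)$.

(iii) By decomposing $B$ as above and rearranging, we immediately get

\[ \mu(B \setminus A) = \mu(B) - \mu(A). \]

Note, however, that this only makes sense if $\mu(A) < \infty$, for otherwise we must also have $\mu(B) = \infty$ by part (ii), and then $\mu(B) - \mu(A)$ would not be defined. □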


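Proposition 5.8. Let $(M, \Sigma, \mu)$ be a measure space and let $\{A_n\}_{n\in\mathbb{N}}$ be a sequence in $\Sigma$.

(i) If $\{A_n\}_{n\in\mathbb{N}}$ is increasing, i.e. $A_n \subseteq A_{n+1}$ for all $n \in \mathbb{N}$, then

\[ \mu\Big( \bigcup_{n=0}^{\infty} A_n \Big) = \lim_{n\to\infty} \mu(A_n). \]

(ii) If $\mu(A_0) < \infty$ and $\{A_n\}_{n\in\mathbb{N}}$ is decreasing, i.e. $A_{n+1} \subseteq A_n$ for all $n \in \mathbb{N}$, then

\[ \mu\Big( \bigcap_{n=0}^{\infty} A_n \Big) = \lim_{n\to\infty} \mu(A_n). \]

We say that $\mu$ is (i) continuous from below and (ii) continuous from above.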


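Proof. (i) Define $B_0 := A_0$ and $B_n := A_n \setminus A_{n-1}$ for $n \geq 1$. Then, $\{B_n\}_{n\in\mathbb{N}}$ is a pairwise disjoint sequence in $\Sigma$ such that

\[ \bigcup_{i=0}^{n} B_i = A_n \qquad \text{and} \qquad \bigcup_{n=0}^{\infty} A_n = \bigcup_{n=0}^{\infty} B_n. \]

Hence, we have

\[
\begin{aligned}
\mu\Big( \bigcup_{n=0}^{\infty} A_n \Big) &= \mu\Big( \bigcup_{n=0}^{\infty} B_n \Big) \\
&= \sum_{i=0}^{\infty} \mu(B_i) \\
&= \lim_{n\to\infty} \sum_{i=0}^{n} \mu(B_i) \\
&= \lim_{n\to\infty} \mu\Big( \bigcup_{i=0}^{n} B_i \Big) \\
&= \lim_{n\to\infty} \mu(A_n).
\end{aligned}
\]
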

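(ii) Define $B_n := A_0 \setminus A_n$. Then, $B_n \subseteq B_{n+1}$ for all $n \in \mathbb{N}$ and thus, by part (i), we have

\[
\begin{aligned}
\mu\Big( \bigcup_{n=0}^{\infty} B_n \Big) &= \lim_{n\to\infty} \mu(B_n) \\
&= \lim_{n\to\infty} \big( \mu(A_0) - \mu(A_n) \big) \\
&= \mu(A_0) - \lim_{n\to\infty} \mu(A_n).
\end{aligned}
\]

Note that, by definition of $B_n$, we have

\[ \bigcup_{n=0}^{\infty} B_n = A_0 \setminus \bigcap_{n=0}^{\infty} A_n. \]

Since $\mu(A_0) < \infty$, it follows from the previous proposition that

\[
\begin{aligned}
\mu(A_0) - \mu\Big( \bigcap_{n=0}^{\infty} A_n \Big) &= \mu\Big( A_0 \setminus \bigcap_{n=0}^{\infty} A_n \Big) \\
&= \mu\Big( \bigcup_{n=0}^{\infty} B_n \Big) \\
&= \mu(A_0) - \lim_{n\to\infty} \mu(A_n).
\end{aligned}
\]

Therefore, we have

\[ \mu\Big( \bigcap_{n=0}^{\infty} A_n \Big) = \lim_{n\to\infty} \mu(A_n). \]
□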


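Remark 5.9. Note that the result in the second part of this proposition need not be true if $\mu(A_0) = \infty$. For example, consider $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$, where $\mu$ is the counting measure on $(\mathbb{N}, \mathcal{P}(\mathbb{N}))$. If $A_n = \{n, n+1, n+2, \ldots\}$, then $\{A_n\}_{n\in\mathbb{N}}$ is a decreasing sequence. Since $\mu(A_n) = \infty$ for all $n \in \mathbb{N}$, we have $\lim_{n\to\infty} \mu(A_n) = \infty$. On the other hand, $\bigcap_{n=0}^{\infty} A_n = \emptyset$ and thus $\mu\big( \bigcap_{n=0}^{\infty} A_n \big) = 0$.
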


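Proposition 5.10. Let $(M, \Sigma, \mu)$ be a measure space. Then, $\mu$ is countably sub-additive. That is, for any sequence $\{A_n\}_{n\in\mathbb{N}}$ in $\Sigma$, we have

\[ \mu\Big( \bigcup_{n=0}^{\infty} A_n \Big) \leq \sum_{n=0}^{\infty} \mu(A_n). \]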


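Proof. (a) First, we show that $\mu(A \cup B) \leq \mu(A) + \mu(B)$ for any $A, B \in \Sigma$. Note that, for any pair of sets $A$ and $B$, the sets $A \setminus B$, $B \setminus A$ and $A \cap B$ are pairwise disjoint and their union is $A \cup B$.

[Figure: Venn diagram of the sets A and B.]

By writing $A = (A \setminus B) \cup (A \cap B)$ and $B = (B \setminus A) \cup (A \cap B)$ and using the additivity and positivity of $\mu$, we have

\[
\begin{aligned}
\mu(A) + \mu(B) &= \mu\big( (A \setminus B) \cup (A \cap B) \big) + \mu\big( (B \setminus A) \cup (A \cap B) \big) \\
&= \mu(A \setminus B) + 2\mu(A \cap B) + \mu(B \setminus A) \\
&= \mu(A \cup B) + \mu(A \cap B) \\
&\geq \mu(A \cup B).
\end{aligned}
\]
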

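(b) We now extend this to finite unions by induction. Let $\{A_n\}_{n\in\mathbb{N}}$ be a sequence in $\Sigma$ and suppose that

\[ \mu\Big( \bigcup_{i=0}^{n} A_i \Big) \leq \sum_{i=0}^{n} \mu(A_i) \]

for some $n \in \mathbb{N}$. Then, by part (a), we have

\[
\begin{aligned}
\mu\Big( \bigcup_{i=0}^{n+1} A_i \Big) &= \mu\Big( A_{n+1} \cup \bigcup_{i=0}^{n} A_i \Big) \\
&\leq \mu(A_{n+1}) + \mu\Big( \bigcup_{i=0}^{n} A_i \Big) \\
&\leq \mu(A_{n+1}) + \sum_{i=0}^{n} \mu(A_i) \\
&= \sum_{i=0}^{n+1} \mu(A_i).
\end{aligned}
\]

Hence, by induction on $n$ with base case $n = 1$ and noting that the case $n = 0$ is trivial (it reduces to $\mu(A_0) = \mu(A_0)$), we have

\[ \forall\, n \in \mathbb{N} : \quad \mu\Big( \bigcup_{i=0}^{n} A_i \Big) \leq \sum_{i=0}^{n} \mu(A_i). \]
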


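(c) Let $\{A_n\}_{n\in\mathbb{N}}$ be a sequence in $\Sigma$. Define $B_n := \bigcup_{i=0}^{n} A_i$. Then, $\{B_n\}_{n\in\mathbb{N}}$ is an increasing sequence in $\Sigma$. Hence, by continuity from below of $\mu$, we have

\[
\begin{aligned}
\mu\Big( \bigcup_{i=0}^{\infty} A_i \Big) &= \mu\Big( \bigcup_{n=0}^{\infty} B_n \Big) \\
&= \lim_{n\to\infty} \mu(B_n) \\
&= \lim_{n\to\infty} \mu\Big( \bigcup_{i=0}^{n} A_i \Big) \\
&\leq \lim_{n\to\infty} \sum_{i=0}^{n} \mu(A_i) \\
&= \sum_{i=0}^{\infty} \mu(A_i),
\end{aligned}
\]

which is what we wanted. □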
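
The identity $\mu(A) + \mu(B) = \mu(A \cup B) + \mu(A \cap B)$ from part (a), and the resulting sub-additivity, can be observed directly with the counting measure on finite sets. The sets below are arbitrary examples chosen for illustration.

```python
# Counting measure on finite sets: the number of elements.
mu = len

A = frozenset({0, 1, 2})
B = frozenset({2, 3})
C = frozenset({0, 3, 4})

# Part (a): mu(A) + mu(B) = mu(A u B) + mu(A n B), so mu(A u B) <= mu(A) + mu(B).
two_set_lhs = mu(A) + mu(B)
two_set_rhs = mu(A | B) + mu(A & B)

# Part (b) for three overlapping sets.
measure_of_union = mu(A | B | C)
sum_of_measures = mu(A) + mu(B) + mu(C)
```

The inequality is strict here because the sets overlap; it becomes an equality exactly when the overlaps have measure zero.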




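Definition. Let $(M, \Sigma, \mu)$ be a measure space. The measure $\mu$ is said to be $\sigma$-finite if there exists a sequence $\{A_n\}_{n\in\mathbb{N}}$ in $\Sigma$ such that $\bigcup_{n=0}^{\infty} A_n = M$ and

\[ \forall\, n \in \mathbb{N} : \quad \mu(A_n) < \infty. \]

Example 5.11. The counting measure on $(\mathbb{N}, \mathcal{P}(\mathbb{N}))$ is $\sigma$-finite. To see this, define $A_n := \{n\}$. Then, clearly $\bigcup_{n=0}^{\infty} A_n = \mathbb{N}$ and $\mu(A_n) = |\{n\}| = 1 < \infty$ for all $n \in \mathbb{N}$.


5.2 Borel $\sigma$-algebras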


We have already remarked the parallel between topologies and $\sigma$-algebras. A further similarity stems from the fact that, just like for topologies, interesting $\sigma$-algebras are hardly ever given explicitly, except in some simple cases. In general, they are defined implicitly by some membership condition.


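Proposition 5.12. Let $M$ be a set and let $\{\Sigma_i \mid i \in I\}$ be a collection of $\sigma$-algebras on $M$. Define the set

\[ \Sigma := \bigcap_{i\in I} \Sigma_i = \{ A \in \mathcal{P}(M) \mid A \in \Sigma_i,\ \forall\, i \in I \}. \]

Then, $\Sigma$ is a $\sigma$-algebra on $M$.

Proof. We simply check that $\Sigma$ satisfies the defining properties of a $\sigma$-algebra.

(i) We have $M \in \Sigma_i$ for all $i \in I$ and hence $M \in \Sigma$.

(ii) Let $A \in \Sigma$. Then, $A \in \Sigma_i$ for all $i \in I$ and, since each $\Sigma_i$ is a $\sigma$-algebra, we also have $M \setminus A \in \Sigma_i$ for all $i \in I$. Hence, $M \setminus A \in \Sigma$.

(iii) Let $\{A_n\}_{n\in\mathbb{N}}$ be a sequence in $\Sigma$. Then, $\{A_n\}_{n\in\mathbb{N}}$ is a sequence in each $\Sigma_i$. Thus,

\[ \forall\, i \in I : \quad \bigcup_{n=0}^{\infty} A_n \in \Sigma_i. \]

Hence, we also have $\bigcup_{n=0}^{\infty} A_n \in \Sigma$. □
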


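Definition. Let $M$ be a set and let $\mathcal{E} \subseteq \mathcal{P}(M)$ be a collection of subsets of $M$. The $\sigma$-algebra generated by $\mathcal{E}$, denoted $\sigma(\mathcal{E})$, is the smallest $\sigma$-algebra on $M$ containing all the sets in $\mathcal{E}$. That is,

\[ A \in \sigma(\mathcal{E}) \ \Leftrightarrow\ \text{for all $\sigma$-algebras $\Sigma$ on $M$} : \ \mathcal{E} \subseteq \Sigma \Rightarrow A \in \Sigma \]

or, by letting $\{\Sigma_i \mid i \in I\}$ be the collection of $\sigma$-algebras on $M$ such that $\mathcal{E} \subseteq \Sigma_i$,

\[ \sigma(\mathcal{E}) := \bigcap_{i\in I} \Sigma_i. \]

The set $\mathcal{E}$ is called a generating set for $\sigma(\mathcal{E})$. Observe that the second characterisation makes it manifest that $\sigma(\mathcal{E})$ is indeed a $\sigma$-algebra on $M$ by the previous proposition.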
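
For a finite set $M$ the generated $\sigma$-algebra can be computed by brute force, since countable unions reduce to finite ones. The sketch below does not use the intersection characterisation from the definition. It uses the swapped-in, equivalent approach of closing the generating set under complements and pairwise unions until nothing new appears. The function name and the example are illustrative assumptions.

```python
from itertools import combinations

def generated_sigma_algebra(M, E):
    """Smallest sigma-algebra on the finite set M that contains every set in E."""
    M = frozenset(M)
    sigma = {frozenset(), M} | {frozenset(A) for A in E}
    while True:
        new = {M - A for A in sigma}                        # complements
        new |= {A | B for A, B in combinations(sigma, 2)}   # pairwise unions
        if new <= sigma:                                    # closed: nothing new
            return sigma
        sigma |= new

M = {1, 2, 3, 4}
E = [{1}, {1, 2}]
sigma = generated_sigma_algebra(M, E)
```

In this example the result consists of all unions of the three “atoms” $\{1\}$, $\{2\}$ and $\{3,4\}$, giving $2^3 = 8$ measurable sets. The set $\{3\}$ is not among them, so it is not measurable with respect to this $\sigma$-algebra.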


Theorem 5.13. Let(M, =) be a measurable space. Then,  =o(E) for some E C A(M). 


This generating construction immediately allows us to link the notions of topology and 
o-algebra on aset M via the following definition. 


Definition. Let (M, ©) be a topological space. The Borel o-algebra on (M, ©) is a(O). 


Recall that a topology on $M$ is a collection $\mathcal{O} \subseteq \mathcal{P}(M)$ of subsets of $M$ which contains
$\emptyset$ and $M$ and is closed under finite intersections and arbitrary (even uncountable) unions.
The elements of the topology are called open sets. Of course, while there are many choices of
$\sigma$-algebra on $M$, if we already have a topology $\mathcal{O}$ on $M$, then the associated Borel $\sigma$-algebra
is a very convenient choice of $\sigma$-algebra since, as we will soon see, it induces a measurable
structure which is "compatible" with the already given topological structure.

This is, in fact, the usual philosophy in mathematics: we always let the stronger structures
induce the weaker ones, unless otherwise specified. For instance, once we have chosen
an inner product on a space, we take the norm to be the induced norm, which induces a
metric, which in turn induces a topology on that space, from which we now know how to
obtain a canonical $\sigma$-algebra.

We remark that, while the Borel $\sigma$-algebra on a topological space is generated by the
open sets, in general, it contains much more than just the open sets.


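Example 5.14. Recall that the standard topology on $\mathbb{R}$, denoted $\mathcal{O}_{\mathbb{R}}$, is defined by
\[ A \in \mathcal{O}_{\mathbb{R}} \ :\Leftrightarrow\ \forall\, a \in A : \exists\, \varepsilon > 0 : \forall\, r \in \mathbb{R} : \ |r - a| < \varepsilon \Rightarrow r \in A. \]
In fact, the elements of $\mathcal{O}_{\mathbb{R}}$ are at most countable unions of open intervals in $\mathbb{R}$. Consider
now the Borel $\sigma$-algebra on $(\mathbb{R}, \mathcal{O}_{\mathbb{R}})$. Let $a < b$. Then, for any integer $n \geq 1$, the interval $(a - \tfrac{1}{n}, b)$
is open. Hence, $\{(a - \tfrac{1}{n}, b)\}_{n \geq 1}$ is a sequence in $\sigma(\mathcal{O}_{\mathbb{R}})$. Since $\sigma$-algebras are closed under
countable intersections, we have
\[ \bigcap_{n=1}^{\infty} \left(a - \tfrac{1}{n}, b\right) = [a, b) \in \sigma(\mathcal{O}_{\mathbb{R}}). \]
Hence, $\sigma(\mathcal{O}_{\mathbb{R}})$ contains, in addition to all open intervals, also all half-open intervals. It is not
difficult to show that it contains all closed intervals as well. In particular, since singletons
are closed, $\sigma(\mathcal{O}_{\mathbb{R}})$ also contains all countable subsets of $\mathbb{R}$. In fact, it is non-trivial$^1$ to
produce a subset of $\mathbb{R}$ which is not contained in $\sigma(\mathcal{O}_{\mathbb{R}})$.


$^1$ https://en.wikipedia.org/wiki/Borel_set#Non-Borel_sets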


5.3 Lebesgue measure on $\mathbb{R}^d$


Definition. Let $(M, \Sigma, \mu)$ be a measure space. If $A \in \Sigma$ is such that $\mu(A) = 0$, then $A$ is
called a null set or a set of measure zero.


The following definition is not needed for the construction of the Lebesgue measure. 
However, since it is closely connected with that of null set and will be used a lot in the 
future, we chose to present it here. 


Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $P$ be some property or statement.
We say that $P$ holds almost everywhere on $M$ if
\[ \exists\, Z \in \Sigma : \ \mu(Z) = 0 \ \text{ and } \ \forall\, m \in M \setminus Z : \ P(m). \]

In other words, the property $P$ is said to hold almost everywhere on $M$ if it holds
everywhere on $M$ except for a null subset of $M$.


Example 5.15. Let $(M, \Sigma, \mu)$ be a measure space and let $f, g \colon M \to N$ be maps. We say
that $f$ and $g$ are almost everywhere equal, and we write $f =_{\mathrm{a.e.}} g$, if there exists a null set
$Z \in \Sigma$ such that
\[ \forall\, m \in M \setminus Z : \ f(m) = g(m). \]
The case $f = g$ corresponds to $Z = \emptyset$.


Definition. A measure $\mu \colon \Sigma \to [0, \infty]$ is said to be complete if every subset of every null
set is measurable, i.e.
\[ \forall\, A \in \Sigma : \forall\, B \in \mathcal{P}(A) : \ \mu(A) = 0 \Rightarrow B \in \Sigma. \]

Note that since for any $A, B \in \Sigma$, $B \subseteq A$ implies $\mu(B) \leq \mu(A)$, it follows that every
subset of a null set, if measurable, must also be a null set.


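Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $(M, +, \cdot)$ be a vector space. The
measure $\mu$ is said to be translation-invariant if
\[ \forall\, m \in M : \forall\, A \in \Sigma : \ A + m \in \Sigma \ \text{ and } \ \mu(A + m) = \mu(A), \]
where $A + m := \{a + m \mid a \in A\}$.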


Theorem 5.16. Let $\mathcal{O}_{\mathbb{R}^d}$ be the standard topology on $\mathbb{R}^d$. There exists a unique complete,
translation-invariant measure
\[ \lambda^d \colon \sigma(\mathcal{O}_{\mathbb{R}^d}) \to [0, \infty] \]
such that for all $a_i, b_i \in \mathbb{R}$ with $1 \leq i \leq d$ and $a_i < b_i$, we have
\[ \lambda^d\big([a_1, b_1) \times \cdots \times [a_d, b_d)\big) = \prod_{i=1}^{d} (b_i - a_i). \]


Definition. The measure $\lambda^d$ is called the Lebesgue measure on $\mathbb{R}^d$.


The superscript $d$ in $\lambda^d$ may be suppressed if there is no risk of confusion. Note that
the Lebesgue measure on $\mathbb{R}$, $\mathbb{R}^2$ and $\mathbb{R}^3$ coincides with the standard notions of length, area
and volume, with the further insight that these are only defined for the elements of the
respective Borel $\sigma$-algebras.


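Proposition 5.17. The Lebesgue measure on $\mathbb{R}$ is $\sigma$-finite.

Proof. Consider the sequence $\{[a_n, a_n + 1)\}_{n\in\mathbb{N}}$ where
\[ a_n := \begin{cases} -\tfrac{1}{2} n & \text{if $n$ is even} \\ \tfrac{1}{2}(n+1) & \text{if $n$ is odd.} \end{cases} \]
That is, $\{a_n\}_{n\in\mathbb{N}}$ is the sequence $(0, 1, -1, 2, -2, 3, -3, \ldots)$.

Clearly, we have $\bigcup_{n=0}^{\infty} [a_n, a_n + 1) = \mathbb{R}$. Since, for all $n \in \mathbb{N}$, $[a_n, a_n + 1) \in \sigma(\mathcal{O}_{\mathbb{R}})$ and
$\lambda([a_n, a_n + 1)) = 1 < \infty$, the Lebesgue measure $\lambda$ is $\sigma$-finite. $\square$


This can easily be generalised to show that $\lambda^d$ is $\sigma$-finite for all $d \geq 1$.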
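As a quick sanity check of the enumeration used in the proof, the sketch below generates the left endpoints $a_n$ and verifies that the first $2N+2$ unit intervals $[a_n, a_n+1)$ tile $[-N, N+2)$ without gaps. The helper name `a` and the cut-off `N` are illustrative choices only.

```python
def a(n: int) -> int:
    """Left endpoint a_n of the n-th unit interval [a_n, a_n + 1)."""
    return -(n // 2) if n % 2 == 0 else (n + 1) // 2


first_terms = [a(n) for n in range(7)]
print(first_terms)

# The endpoints a_0, ..., a_{2N+1} are exactly the integers -N, ..., N+1,
# so the corresponding unit intervals cover [-N, N+2).
N = 5
left_endpoints = sorted(a(n) for n in range(2 * N + 2))
print(left_endpoints == list(range(-N, N + 2)))
```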


5.4 Measurable maps 


As we have remarked earlier, once we introduce a new structure, we should immediately 
think about the associated structure-preserving maps. In the case of measurable spaces, a 
measurable map is one that preserves the “measurability” structure. 


Definition. Let $(M, \Sigma_M)$ and $(N, \Sigma_N)$ be measurable spaces. A map $f \colon M \to N$ is said
to be measurable if
\[ \forall\, A \in \Sigma_N : \ \operatorname{preim}_f(A) \in \Sigma_M. \]

Note that this is exactly the definition of continuous map between topological spaces,
with "continuous" replaced by "measurable" and topologies replaced by $\sigma$-algebras.


Lemma 5.18. Let $(M, \Sigma_M)$ and $(N, \Sigma_N)$ be measurable spaces. A map $f \colon M \to N$ is
measurable if, and only if,
\[ \forall\, A \in \mathcal{E} : \ \operatorname{preim}_f(A) \in \Sigma_M, \]
where $\mathcal{E} \subseteq \mathcal{P}(N)$ is a generating set of $\Sigma_N$.


Corollary 5.19. Let $(M, \mathcal{O}_M)$ and $(N, \mathcal{O}_N)$ be topological spaces. Any continuous map
$M \to N$ is measurable with respect to the Borel $\sigma$-algebras on $M$ and $N$.


Recall that a map $\mathbb{R} \to \mathbb{R}$ is monotonic if it is either increasing or decreasing.


Corollary 5.20. Any monotonic map $\mathbb{R} \to \mathbb{R}$ is measurable with respect to the Borel
$\sigma$-algebra (with respect to $\mathcal{O}_{\mathbb{R}}$).


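Proposition 5.21. Let $(M, \Sigma_M)$, $(N, \Sigma_N)$ and $(P, \Sigma_P)$ be measurable spaces. If $f \colon M \to N$
and $g \colon N \to P$ are both measurable, then so is their composition $g \circ f \colon M \to P$.


Proof. Let $A \in \Sigma_P$. As $g$ is measurable, we have $\operatorname{preim}_g(A) \in \Sigma_N$. Then, since $f$ is
measurable, it follows that
\[ \operatorname{preim}_f(\operatorname{preim}_g(A)) = \operatorname{preim}_{g \circ f}(A) \in \Sigma_M. \]
Hence, $g \circ f$ is measurable. $\square$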


Proposition 5.22. Let $(M, \Sigma_M)$ and $(N, \Sigma_N)$ be measurable spaces and let $\{f_n\}_{n\in\mathbb{N}}$ be a
sequence of measurable maps from $M$ to $N$ whose pointwise limit is $f$. Then, $f$ is measurable.


Recall that $\{f_n\}_{n\in\mathbb{N}}$ converges pointwise to $f \colon M \to N$ if
\[ \forall\, m \in M : \ \lim_{n\to\infty} f_n(m) = f(m). \]


This is in contrast with continuity, as pointwise convergence of a sequence of continuous 
maps is not a sufficient condition for the continuity of the pointwise limit. In the case 
of real or complex-valued maps, a sufficient condition is convergence with respect to the 
supremum norm. 


5.5 Push-forward of a measure 


If we have a structure-preserving map f between two instances A and B of some structure, 
and an object on A (which depends in some way on the structure), we can often use f to 
induce a similar object on B. This is generically called the push-forward of that object 
along the map f. 


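Proposition 5.23. Let $(M, \Sigma_M, \mu)$ be a measure space, let $(N, \Sigma_N)$ be a measurable space
and let $f \colon M \to N$ be a measurable map. Then, the map
\[ f_*\mu \colon \Sigma_N \to [0, \infty], \qquad A \mapsto \mu(\operatorname{preim}_f(A)) \]
is a measure on $(N, \Sigma_N)$ called the push-forward of $\mu$ along $f$.


That $f_*\mu$ is a measure follows easily from the fact that $\mu$ is a measure and basic
properties of pre-images of maps, namely
\[ \operatorname{preim}_f(A \setminus B) = \operatorname{preim}_f(A) \setminus \operatorname{preim}_f(B), \qquad \operatorname{preim}_f\Big(\bigcup_{i \in I} A_i\Big) = \bigcup_{i \in I} \operatorname{preim}_f(A_i). \]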
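For a measure on a finite set given by point masses, the push-forward can be computed directly from the formula $A \mapsto \mu(\operatorname{preim}_f(A))$. The sketch below assumes such a discrete setting (with the power set as $\sigma$-algebra, so that every map is measurable); the names `push_forward`, `measure` and the die example are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def push_forward(weights, f):
    """Push a point-mass measure (dict: point -> mass) forward along f."""
    out = defaultdict(Fraction)
    for m, w in weights.items():
        out[f(m)] += w          # the mass of m lands on f(m)
    return dict(out)


def measure(weights, A):
    """Measure of the set A with respect to a point-mass measure."""
    return sum((w for m, w in weights.items() if m in A), Fraction(0))


# Uniform measure on the faces of a die, pushed forward along the parity map.
mu = {m: Fraction(1, 6) for m in range(1, 7)}
parity = lambda m: m % 2
nu = push_forward(mu, parity)

A = {1}                                           # "odd" in the target space
preimage = {m for m in mu if parity(m) in A}      # {1, 3, 5}
print(measure(nu, A), measure(mu, preimage))
```

Both printed values agree, which is exactly the defining property of the push-forward measure.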




6 Integration of measurable functions 


We will now focus on measurable functions $M \to \mathbb{R}$ and define their integral on a subset
of $M$ with respect to some measure on $M$, which is called the Lebesgue integral. Note
that, even if $M \subseteq \mathbb{R}^d$, the Lebesgue integral of a function need not be with respect to the
Lebesgue measure.

The key application of this material is the definition of the Banach spaces of (classes
of) Lebesgue integrable functions $L^p$. The case $p = 2$ is especially important since $L^2$ is, in
fact, a Hilbert space. It appears a lot in quantum mechanics where it is loosely referred to
as the space of square-integrable functions.


6.1 Characteristic and simple functions 


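Definition. Let $M$ be a set and let $A \in \mathcal{P}(M)$. The characteristic function of $A$, denoted
$\chi_A \colon M \to \mathbb{R}$, is defined by
\[ \chi_A(m) := \begin{cases} 1 & \text{if } m \in A \\ 0 & \text{if } m \notin A. \end{cases} \]


Example 6.1. Below is the graph of $\chi_{(1,2]} \colon \mathbb{R} \to \mathbb{R}$.

[Figure omitted: graph of $\chi_{(1,2]}$.]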


The following properties of characteristic functions are immediate. 
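Proposition 6.2. Let $M$ be a set and let $A, B \in \mathcal{P}(M)$. Then
(i) $\chi_\emptyset = 0$
(ii) $\chi_{A \cup B} = \chi_A + \chi_B - \chi_{A \cap B}$
(iii) $\chi_{A \cap B} = \chi_A \chi_B$
(iv) $\chi_{M \setminus A} + \chi_A = 1$


where the addition and multiplication are pointwise and the $0$ and $1$ in parts (i) and (iv)
are the constant functions $M \to \mathbb{R}$ mapping every $m \in M$ to $0 \in \mathbb{R}$ and $1 \in \mathbb{R}$, respectively.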


Definition. Let $M$ be a set. A function $s \colon M \to \mathbb{R}$ is simple if $s(M) = \{r_1, \ldots, r_n\}$ for
some $n \in \mathbb{N}$.


Equivalently, $s \colon M \to \mathbb{R}$ is simple if there exist $r_1, \ldots, r_n \in \mathbb{R}$ and $A_1, \ldots, A_n \in \mathcal{P}(M)$,
for some $n \in \mathbb{N}$, such that
\[ s = \sum_{i=1}^{n} r_i \chi_{A_i}. \]
So $s$ is simple if it is a linear combination of characteristic functions.


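Example 6.3. Consider the simple function $s \colon \mathbb{R} \to \mathbb{R}$ given by $s := \chi_{[1,3]} + 2\chi_{[2,5]}$.

[Figure omitted: graph of $s$.]

By observing the graph, we see that we can re-write $s$ as
\[ s = \chi_{[1,2)} + 3\chi_{[2,3]} + 2\chi_{(3,5]}. \]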


Definition. A simple function is said to be in its standard form if
\[ s = \sum_{i=1}^{n} r_i \chi_{A_i}, \]
where $A_i \cap A_j = \emptyset$ whenever $i \neq j$.


Any simple function can be written in standard form. It is clear that if $s$ is in its
standard form, then $A_i = \operatorname{preim}_s(\{r_i\})$.


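Proposition 6.4. Let $(M, \Sigma)$ be a measurable space and let $A, A_1, \ldots, A_n \in \mathcal{P}(M)$. Then
(i) $\chi_A$ is measurable if, and only if, $A \in \Sigma$
(ii) if $s = \sum_{i=1}^{n} r_i \chi_{A_i}$ is a simple function in its standard form, then $s$ is measurable if,
and only if, we have $A_i \in \Sigma$ for all $1 \leq i \leq n$.


Proof. (i) We have $\chi_A \colon M \to \mathbb{R}$ with possible values $\{0, 1\}$. Equipping $\mathbb{R}$ with its Borel
$\sigma$-algebra w.r.t. the standard topology, $\sigma(\mathcal{O}_{\mathbb{R}})$, we have:

(a) ($\Rightarrow$) We have
\[ A = M \setminus \operatorname{preim}_{\chi_A}([0, 1)), \]
but $[0, 1) \in \sigma(\mathcal{O}_{\mathbb{R}})$ and so $\operatorname{preim}_{\chi_A}([0, 1)) \in \Sigma$. Then, by property (ii) of a
$\sigma$-algebra, $A \in \Sigma$.

(b) ($\Leftarrow$) Let $\alpha = [1, \infty)$, $\beta = (-\infty, 0)$ and $\gamma = [0, 1]$. Clearly $\alpha, \beta, \gamma \in \sigma(\mathcal{O}_{\mathbb{R}})$ and
$\alpha \cup \beta \cup \gamma = \mathbb{R}$. Then
\[ \operatorname{preim}_{\chi_A}(\alpha) = A, \qquad \operatorname{preim}_{\chi_A}(\beta) = \emptyset, \qquad \operatorname{preim}_{\chi_A}(\gamma) = M, \]
all of which are measurable sets on $M$. More generally, for any $B \in \sigma(\mathcal{O}_{\mathbb{R}})$ the
pre-image $\operatorname{preim}_{\chi_A}(B)$ is one of $\emptyset$, $A$, $M \setminus A$ or $M$, according to whether $B$
contains $1$ and/or $0$, and so $\chi_A$ is measurable.

(ii) First define $\tilde{\chi}_{A_i} := r_i \chi_{A_i}$, which satisfies
\[ \tilde{\chi}_{A_i}(m) = \begin{cases} r_i & \text{if } m \in A_i \\ 0 & \text{if } m \notin A_i. \end{cases} \]
As $s$ is in its standard form we know $\tilde{\chi}_{A_i} \tilde{\chi}_{A_j} = 0$ for all $i \neq j$. Combining these
two things, and defining $\alpha^i = [r_i, \infty)$, $\gamma^i = [0, r_i]$, the result follows from part (i). $\square$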


6.2 Integration of non-negative measurable simple functions 


We begin with the definition of the integral of a non-negative, measurable, simple function. 


Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $s \colon M \to \mathbb{R}$ be a nowhere negative,
measurable, simple function whose standard form is $s = \sum_{i=1}^{n} r_i \chi_{A_i}$. Then, we define
\[ \int_M s \, d\mu := \sum_{i=1}^{n} r_i \mu(A_i). \]

Note that the non-negativity condition is essential since $\mu$ takes values in $[0, \infty]$, hence
we could have $\mu(A_i) = \infty$ for more than one $A_i$, and if the corresponding coefficients $r_i$ have
opposite signs, then we would have $\int_M s \, d\mu = \infty - \infty$, which is not defined. For the
same reason, we are considering $s \colon M \to [0, \infty)$ rather than $s \colon M \to [0, \infty]$.


Example 6.5. Consider the measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$, where $\mu$ is the counting measure,
let $f \colon \mathbb{N} \to \mathbb{R}$ be non-negative and suppose that there exists $N \in \mathbb{N}$ such that $f(n) = 0$ for
all $n > N$. Then, we can write $f$ as
\[ f = \sum_{n=0}^{N} f(n) \chi_{\{n\}}. \]
That is, $f$ is a non-negative, measurable, simple function and therefore
\[ \int_{\mathbb{N}} f \, d\mu := \sum_{n=0}^{N} f(n) \mu(\{n\}) = \sum_{n=0}^{N} f(n). \]
Hence, the "integral" of $f$ over $\mathbb{N}$ with respect to the counting measure is just the sum.


The need for simple functions to be in their standard form, which was introduced to 
avoid any potential ambiguity in the definition of their integral, can be relaxed using the 
following lemma. 


Lemma 6.6. Let $(M, \Sigma, \mu)$ be a measure space. Let $s$ and $t$ be non-negative, measurable,
simple functions $M \to \mathbb{R}$ and let $c \in [0, \infty)$. Then
\[ \int_M (cs + t) \, d\mu = c \int_M s \, d\mu + \int_M t \, d\mu. \]


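Proposition 6.7. Let $(M, \Sigma, \mu)$ be a measure space and let $s = \sum_{i=1}^{n} r_i \chi_{A_i}$ be a non-negative,
measurable, simple function $M \to \mathbb{R}$ not necessarily in its standard form. Then
\[ \int_M s \, d\mu = \sum_{i=1}^{n} r_i \mu(A_i). \]


Corollary 6.8. Let $(M, \Sigma, \mu)$ be a measure space and let $s$ and $t$ be non-negative, measurable,
simple functions $M \to \mathbb{R}$ such that $s \leq t$ (that is, $s(m) \leq t(m)$ for all $m \in M$). Then
\[ \int_M s \, d\mu \leq \int_M t \, d\mu. \]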
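The independence of the integral from the chosen representation can be checked on the simple function of Example 6.3, integrated against the Lebesgue measure on $\mathbb{R}$. The sketch below relies on the fact that the Lebesgue measure of a bounded interval is its length, whether or not the endpoints are included (singletons are null); the helper names `length` and `integral` are hypothetical.

```python
def length(interval):
    """Lebesgue measure of a bounded interval given by its endpoints (a, b)."""
    a, b = interval
    return b - a


def integral(terms):
    """Sum of r_i * lambda(A_i) for a list of (r_i, A_i) pairs."""
    return sum(r * length(A) for r, A in terms)


# s = chi_[1,3] + 2 chi_[2,5]  (overlapping sets, not in standard form)
non_standard = [(1, (1, 3)), (2, (2, 5))]

# s = chi_[1,2) + 3 chi_[2,3] + 2 chi_(3,5]  (pairwise disjoint sets)
standard = [(1, (1, 2)), (3, (2, 3)), (2, (3, 5))]

print(integral(non_standard), integral(standard))
```

Both representations give the same value, as Proposition 6.7 asserts.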


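Lemma 6.9. Let $(M, \Sigma, \mu)$ be a measure space and let $s = \sum_{i=1}^{n} r_i \chi_{A_i}$ be a non-negative,
measurable, simple function $M \to \mathbb{R}$. Define the map
\[ \nu_s \colon \Sigma \to [0, \infty], \qquad A \mapsto \int_M s \chi_A \, d\mu, \]
where $s \chi_A$ is the pointwise product of $s$ and $\chi_A$. Then, $\nu_s$ is a measure on $(M, \Sigma)$.


Proof. First, note that we have
\[ \int_M s \chi_A \, d\mu = \int_M \Big( \sum_{i=1}^{n} r_i \chi_{A_i} \chi_A \Big) d\mu = \int_M \Big( \sum_{i=1}^{n} r_i \chi_{A_i \cap A} \Big) d\mu = \sum_{i=1}^{n} r_i \mu(A_i \cap A). \]

We now check that $\nu_s$ satisfies the defining properties of a measure.

(i) $\nu_s(\emptyset) = \int_M s \chi_\emptyset \, d\mu = \sum_{i=1}^{n} r_i \mu(A_i \cap \emptyset) = \sum_{i=1}^{n} r_i \mu(\emptyset) = 0$.

(ii) Let $\{B_j\}_{j\in\mathbb{N}}$ be a pairwise disjoint sequence in $\Sigma$. Then, since the sets $A_i \cap B_j$ are
pairwise disjoint in $j$ and $\mu$ is $\sigma$-additive,
\[ \nu_s\Big( \bigcup_{j=0}^{\infty} B_j \Big) = \sum_{i=1}^{n} r_i \mu\Big( \bigcup_{j=0}^{\infty} (A_i \cap B_j) \Big) = \sum_{i=1}^{n} r_i \sum_{j=0}^{\infty} \mu(A_i \cap B_j) = \sum_{j=0}^{\infty} \sum_{i=1}^{n} r_i \mu(A_i \cap B_j) = \sum_{j=0}^{\infty} \nu_s(B_j). \]

Thus, $\nu_s$ is a measure on $(M, \Sigma)$. $\square$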


6.3 Integration of non-negative measurable functions 


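As we are interested in measurable functions $M \to \overline{\mathbb{R}}$, we need to define a $\sigma$-algebra on $\overline{\mathbb{R}}$.
We cannot use the Borel $\sigma$-algebra since we haven't even defined a topology on $\overline{\mathbb{R}}$. In fact,
we can easily get a $\sigma$-algebra on $\overline{\mathbb{R}}$ as follows.


Proposition 6.10. The set $\overline{\Sigma} := \{A \in \mathcal{P}(\overline{\mathbb{R}}) \mid A \cap \mathbb{R} \in \sigma(\mathcal{O}_{\mathbb{R}})\}$ is a $\sigma$-algebra on $\overline{\mathbb{R}}$.


In other words, we can simply ignore the infinities in a subset $A$ of $\overline{\mathbb{R}}$ and consider it to
be measurable if $A \setminus \{-\infty, +\infty\}$ is in the Borel $\sigma$-algebra of $\mathbb{R}$. We will always consider $\overline{\mathbb{R}}$
to be equipped with this $\sigma$-algebra.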


Lemma 6.11. Let $(M, \Sigma)$ be a measurable space and let $f, g \colon M \to \overline{\mathbb{R}}$ be measurable. Then,
the following functions are measurable.

(i) $cf + g$, for any $c \in \mathbb{R}$
(ii) $|f|$ and $f^2$
(iii) $fg$ (pointwise product) if $f$ and $g$ are nowhere infinite
(iv) $\max(f, g)$ (defined pointwise).


Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $f \colon M \to \overline{\mathbb{R}}$ be a non-negative,
measurable function. Denote by $S$ the set of all non-negative, measurable, simple functions
$s \colon M \to \mathbb{R}$ such that $s \leq f$. Then, we define
\[ \int_M f \, d\mu := \sup_{s \in S} \int_M s \, d\mu. \]


Remark 6.12. It is often very convenient to introduce the notation
\[ \int_M f(x) \, \mu(dx) := \int_M f \, d\mu, \]
where $x$ is a dummy variable and could be replaced by any other symbol. The reason why
this is a convenient notation is that, while some functions have standard symbols but cannot
be easily represented by an algebraic expression (e.g. characteristic functions), others are
easily expressed in terms of an algebraic formula but do not have a standard name. For
instance, it is much easier to just write
\[ \int_{\mathbb{R}} x^2 \, \mu(dx) \]
than having to first denote the function $\mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$ by a generic $f$ or, say, the more
specific $\mathrm{sq}_{\mathbb{R}}$, and then write
\[ \int_{\mathbb{R}} \mathrm{sq}_{\mathbb{R}} \, d\mu. \]
In computer programming, this is akin to defining anonymous functions.
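To make the analogy with anonymous functions concrete, here is a small sketch in which the integrand is passed as a `lambda` rather than as a named function. It assumes a measure given by finitely many point masses; the function `integrate` and the measure `mu` are hypothetical and serve only as an illustration.

```python
def integrate(f, mu):
    """Integral of f with respect to a point-mass measure (dict: point -> mass)."""
    return sum(f(x) * w for x, w in mu.items())


mu = {0: 0.5, 1: 0.25, 2: 0.25}

# Analogue of writing the integrand as the expression x^2 under the integral sign:
anonymous = integrate(lambda x: x ** 2, mu)


# Analogue of first naming the function and then integrating it:
def sq(x):
    return x ** 2


named = integrate(sq, mu)
print(anonymous, named)
```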


Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $f \colon M \to \overline{\mathbb{R}}$ be a non-negative,
measurable function. For any $A \in \Sigma$ (that is, any measurable subset of $M$), we define
\[ \int_A f \, d\mu := \int_M f \chi_A \, d\mu. \]

Note that the product $f \chi_A$ is measurable by part (iii) of Lemma 6.11.


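Lemma 6.13. Let $(M, \Sigma, \mu)$ be a measure space, let $f, g \colon M \to \overline{\mathbb{R}}$ be non-negative,
measurable functions such that $f \leq g$, and let $A, B \in \Sigma$ be such that $A \subseteq B$. Then

(i) $\int_M f \, d\mu \leq \int_M g \, d\mu$

(ii) $\int_A f \, d\mu \leq \int_B f \, d\mu$.


Proof. (i) Denote by $S_f$ and $S_g$ the sets of non-negative, measurable, simple functions
that are less than or equal to $f$ and $g$, respectively. As $f \leq g$, we have $S_f \subseteq S_g$ and hence
\[ \int_M f \, d\mu := \sup_{s \in S_f} \int_M s \, d\mu \leq \sup_{s \in S_g} \int_M s \, d\mu =: \int_M g \, d\mu. \]

(ii) Since $A \subseteq B$, for any $m \in M$ we have
\[ \chi_A(m) \leq \chi_B(m), \]
hence
\[ f(m) \chi_A(m) \leq f(m) \chi_B(m). \]
In fact, we have equality whenever $m \in A$ or $m \in M \setminus B$, while for $m \in B \setminus A$ the
left-hand side is zero and the right-hand side is non-negative. Hence, $f \chi_A \leq f \chi_B$ and
thus, by part (i), we have
\[ \int_A f \, d\mu = \int_M f \chi_A \, d\mu \leq \int_M f \chi_B \, d\mu = \int_B f \, d\mu. \qquad \square \]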


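Proposition 6.14 (Markov inequality). Let $(M, \Sigma, \mu)$ be a measure space and let $f \colon M \to \overline{\mathbb{R}}$
be a non-negative, measurable function. For any $z \in [0, \infty]$, we have
\[ \int_M f \, d\mu \geq z \, \mu(\operatorname{preim}_f([z, \infty])). \]

Equality is achieved, for instance, when $f$ takes only the values $0$ and $z$, i.e. $f = z \chi_B$ for
some $B \in \Sigma$.

[Figure omitted: graph of $f$ with the set $\operatorname{preim}_f([z, \infty])$ marked.]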


The following is the pivotal theorem of Lebesgue integration. 


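Theorem 6.15 (Monotone convergence theorem). Let $(M, \Sigma, \mu)$ be a measure space and let
$\{f_n\}_{n\in\mathbb{N}}$ be a sequence of non-negative, measurable functions $M \to \overline{\mathbb{R}}$ such that $f_{n+1} \geq f_n$
for all $n \in \mathbb{N}$. If there exists a function $f \colon M \to \overline{\mathbb{R}}$ such that
\[ \forall\, m \in M : \ \lim_{n\to\infty} f_n(m) = f(m) \]
(i.e. $f$ is the pointwise limit of $\{f_n\}_{n\in\mathbb{N}}$), then $f$ is measurable and
\[ \lim_{n\to\infty} \int_M f_n \, d\mu = \int_M f \, d\mu. \]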


Remark 6.16. Observe that this result is in stark contrast with what one may be used
to from Riemann integration, where pointwise convergence of a sequence of integrable functions
$\{f_n\}_{n\in\mathbb{N}}$ is not a sufficient condition for the integral of the limit $f$ to be equal to the limit
of the integrals of $f_n$ or, in fact, even for $f$ to be integrable. For these, we need stronger
conditions on the sequence $\{f_n\}_{n\in\mathbb{N}}$, such as uniform convergence.


The definition of the integral as a supremum is clear and geometrically reasonable.
However, it is in general very difficult to evaluate the integral of any particular function
using it. The monotone convergence theorem provides a much simpler way to evaluate the
integral. One can show that, for any non-negative, measurable function $f$, there exists an
increasing sequence $\{s_n\}_{n\in\mathbb{N}}$ of non-negative, measurable, simple functions (which can be
explicitly constructed from $f$) whose pointwise limit is $f$, and hence we have
\[ \int_M f \, d\mu = \lim_{n\to\infty} \int_M s_n \, d\mu, \]
where the right-hand side can usually be evaluated fairly easily.
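The notes do not spell out the approximating sequence, so the following sketch assumes one standard construction, $s_n := \min(n, 2^{-n} \lfloor 2^n f \rfloor)$, and applies it to $f(x) = x^2$ on $M = [0, 1]$ with the Lebesgue measure. For this $f$ the level set $\{ k/2^n \leq f < (k+1)/2^n \}$ is an interval whose length is known in closed form, so each $\int s_n \, d\lambda$ is a finite sum. The function name `integral_of_s_n` is hypothetical.

```python
import math


def integral_of_s_n(n: int) -> float:
    """Integral over [0, 1] of the n-th dyadic simple approximation of x**2.

    s_n takes the value k / 2**n on the set where k/2**n <= x**2 < (k+1)/2**n,
    which is the interval [sqrt(k/2**n), sqrt((k+1)/2**n)).
    The single point x = 1 is a null set and is ignored.
    """
    N = 2 ** n
    total = 0.0
    for k in range(N):
        level_set_length = math.sqrt((k + 1) / N) - math.sqrt(k / N)
        total += (k / N) * level_set_length
    return total


values = [integral_of_s_n(n) for n in range(1, 13)]
for n, v in enumerate(values, start=1):
    print(n, v)
```

The printed values increase with $n$ and approach $1/3$ from below, in line with the monotone convergence theorem; since $0 \leq f - s_n < 2^{-n}$ here, the error at step $n$ is below $2^{-n}$.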


Example 6.17. Consider the measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$, where $\mu$ is the counting measure,
and let $f \colon \mathbb{N} \to \overline{\mathbb{R}}$ be non-negative. Note that the choice of $\sigma$-algebra $\mathcal{P}(\mathbb{N})$ on $\mathbb{N}$ makes
every function on $\mathbb{N}$ (to any measurable space) measurable. Define, for every $n \in \mathbb{N}$,
\[ s_n := \sum_{i=0}^{n} f(i) \chi_{\{i\}}. \]
Then, $\{s_n\}_{n\in\mathbb{N}}$ is an increasing sequence of non-negative, measurable, simple functions
whose pointwise limit is $f$ and therefore, by the monotone convergence theorem,
\[ \int_{\mathbb{N}} f \, d\mu = \lim_{n\to\infty} \int_{\mathbb{N}} s_n \, d\mu = \lim_{n\to\infty} \sum_{i=0}^{n} f(i) = \sum_{i=0}^{\infty} f(i). \]
If you ever wondered why series seem to share so many properties with integrals, the reason
is that series are just integrals with respect to a discrete measure.


The monotone convergence theorem can be used to extend some of the properties of
integrals of non-negative, measurable, simple functions to non-negative, measurable functions
which are not necessarily simple.


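Lemma 6.18. Let $(M, \Sigma, \mu)$ be a measure space, let $f, g \colon M \to \overline{\mathbb{R}}$ be non-negative,
measurable functions and let $c \in [0, \infty)$. Then

(i) $\int_M (cf + g) \, d\mu = c \int_M f \, d\mu + \int_M g \, d\mu$

(ii) the map $\nu_f \colon \Sigma \to [0, \infty]$ defined by $\nu_f(A) := \int_A f \, d\mu$ is a measure on $(M, \Sigma)$

(iii) for any $A \in \Sigma$, we have $\int_M f \, d\mu = \int_A f \, d\mu + \int_{M \setminus A} f \, d\mu$.


Proof. (i) Let $\{s_n\}_{n\in\mathbb{N}}$ and $\{t_n\}_{n\in\mathbb{N}}$ be increasing sequences of non-negative, measurable,
simple functions whose pointwise limits are $f$ and $g$, respectively. Then, it is easy to
see that $\{cs_n + t_n\}_{n\in\mathbb{N}}$ is an increasing sequence of non-negative, measurable, simple
functions whose pointwise limit is $cf + g$. Hence, by Lemma 6.6 and the monotone
convergence theorem,
\[ \int_M (cf + g) \, d\mu = \lim_{n\to\infty} \int_M (cs_n + t_n) \, d\mu = \lim_{n\to\infty} \Big( c \int_M s_n \, d\mu + \int_M t_n \, d\mu \Big) = c \int_M f \, d\mu + \int_M g \, d\mu. \]

(ii) To check that $\nu_f$ is a measure on $(M, \Sigma)$, first note that we have
\[ \nu_f(\emptyset) = \int_\emptyset f \, d\mu = \int_M f \chi_\emptyset \, d\mu = 0. \]
Let $\{A_i\}_{i\in\mathbb{N}}$ be a pairwise disjoint sequence in $\Sigma$. Define, for any $n \in \mathbb{N}$,
\[ f_n := f \chi_{\left( \bigcup_{i=0}^{n} A_i \right)}. \]
Since, for all $n \in \mathbb{N}$, we have $\bigcup_{i=0}^{n} A_i \subseteq \bigcup_{i=0}^{n+1} A_i$ and $f$ is non-negative, $\{f_n\}_{n\in\mathbb{N}}$ is
an increasing sequence of non-negative, measurable functions whose pointwise
limit is $f \chi_{\left( \bigcup_{i=0}^{\infty} A_i \right)}$. Hence, by recalling Proposition 6.2 (for pairwise disjoint sets,
$\chi_{\bigcup_{i=0}^{n} A_i} = \sum_{i=0}^{n} \chi_{A_i}$), part (i) and the monotone convergence theorem, we have
\[ \nu_f\Big( \bigcup_{i=0}^{\infty} A_i \Big) = \int_M f \chi_{\left( \bigcup_{i=0}^{\infty} A_i \right)} \, d\mu = \lim_{n\to\infty} \int_M f \chi_{\left( \bigcup_{i=0}^{n} A_i \right)} \, d\mu = \lim_{n\to\infty} \sum_{i=0}^{n} \int_M f \chi_{A_i} \, d\mu = \sum_{i=0}^{\infty} \nu_f(A_i). \]

(iii) Note that $A \cap (M \setminus A) = \emptyset$. Hence, by using the fact that $\nu_f$ from part (ii) is a
measure on $(M, \Sigma)$, we have
\[ \int_M f \, d\mu = \nu_f(M) = \nu_f(A) + \nu_f(M \setminus A) = \int_A f \, d\mu + \int_{M \setminus A} f \, d\mu. \qquad \square \]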

Part (i) of the previous lemma and the monotone convergence theorem also imply that,
for any sequence $\{f_n\}_{n\in\mathbb{N}}$ of non-negative, measurable functions, we have
\[ \int_M \Big( \sum_{n=0}^{\infty} f_n \Big) d\mu = \sum_{n=0}^{\infty} \int_M f_n \, d\mu. \]
Again, note that this result does not hold for the Riemann integral unless stronger conditions
are placed on the sequence $\{f_n\}_{n\in\mathbb{N}}$.

Finally, we have a simple but crucial result for Lebesgue integration.


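Theorem 6.19. Let $(M, \Sigma, \mu)$ be a measure space and let $f \colon M \to \overline{\mathbb{R}}$ be a non-negative,
measurable function. Then
\[ \int_M f \, d\mu = 0 \ \Leftrightarrow\ f =_{\mathrm{a.e.}} 0. \]

Proof. ($\Rightarrow$) Suppose that $\int_M f \, d\mu = 0$. Define $A_n := \{m \in M \mid f(m) > \tfrac{1}{n+1}\}$ and let
\[ s_n := \tfrac{1}{n+1} \chi_{A_n}. \]
By definition of $A_n$, we clearly have $s_n \leq f$ for all $n \in \mathbb{N}$. Hence,
\[ 0 \leq \int_M s_n \, d\mu \leq \int_M f \, d\mu = 0 \]
and thus, $\int_M s_n \, d\mu = 0$ for all $n \in \mathbb{N}$. Since by definition
\[ \int_M s_n \, d\mu = \tfrac{1}{n+1} \mu(A_n), \]
we must also have $\mu(A_n) = 0$ for all $n \in \mathbb{N}$. Let $A := \{m \in M \mid f(m) \neq 0\}$. Then, as
$f$ is non-negative, we have
\[ A = \bigcup_{n=0}^{\infty} A_n = \bigcup_{n=0}^{\infty} \{m \in M \mid f(m) > \tfrac{1}{n+1}\} \]
and, since $A_n \subseteq A_{n+1}$ for all $n \in \mathbb{N}$, we have
\[ \mu(A) = \mu\Big( \bigcup_{n=0}^{\infty} A_n \Big) = \lim_{n\to\infty} \mu(A_n) = 0. \]
Thus, $f$ is zero except on the null set $A$. That is, $f =_{\mathrm{a.e.}} 0$.

($\Leftarrow$) Suppose that $f =_{\mathrm{a.e.}} 0$. Let $S$ be the set of non-negative, measurable, simple functions
$s$ such that $s \leq f$. As $f =_{\mathrm{a.e.}} 0$, we have $s =_{\mathrm{a.e.}} 0$ for all $s \in S$. Thus, if
\[ s = \sum_{i=1}^{n} r_i \chi_{A_i} \]
is in its standard form, we must have either $r_i = 0$ or $\mu(A_i) = 0$ for all $1 \leq i \leq n$. Hence, for all $s \in S$,
\[ \int_M s \, d\mu := \sum_{i=1}^{n} r_i \mu(A_i) = 0. \]
Therefore, we have
\[ \int_M f \, d\mu := \sup_{s \in S} \int_M s \, d\mu = 0. \qquad \square \]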
M seS JM 


This means that, for the purposes of Lebesgue integration, null sets can be neglected 
as they do not change the value of an integral. The following are some examples of this. 


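Corollary 6.20. Let $(M, \Sigma, \mu)$ be a measure space, let $A \in \Sigma$ and let $f, g \colon M \to \overline{\mathbb{R}}$ be
non-negative, measurable functions.

(i) If $\mu(A) = 0$, then $\int_A f \, d\mu = 0$

(ii) If $f =_{\mathrm{a.e.}} g$, then $\int_M f \, d\mu = \int_M g \, d\mu$

(iii) If $f \leq_{\mathrm{a.e.}} g$, then $\int_M f \, d\mu \leq \int_M g \, d\mu$.


Proof. (i) Clearly, $f \chi_A =_{\mathrm{a.e.}} 0$ and hence
\[ \int_A f \, d\mu = \int_M f \chi_A \, d\mu = 0. \]

(ii) As $f =_{\mathrm{a.e.}} g$, we have $f - g =_{\mathrm{a.e.}} 0$ and thus
\[ 0 = \int_M (f - g) \, d\mu = \int_M f \, d\mu - \int_M g \, d\mu. \]

(iii) Let $B := \{m \in M \mid f(m) > g(m)\}$. As $f \leq_{\mathrm{a.e.}} g$, we have $\mu(B) = 0$ and $f \leq g$ on
$M \setminus B$. Thus
\[ \int_M f \, d\mu = \int_B f \, d\mu + \int_{M \setminus B} f \, d\mu \leq \int_{M \setminus B} g \, d\mu \leq \int_M g \, d\mu, \]
where we used part (i) and Lemma 6.13. $\square$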


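Example 6.21. Consider $(\mathbb{R}, \sigma(\mathcal{O}_{\mathbb{R}}), \lambda)$ and let $f \colon \mathbb{R} \to \mathbb{R}$ be the Dirichlet function
\[ f(r) := \begin{cases} 1 & \text{if } r \in \mathbb{Q} \\ 0 & \text{if } r \in \mathbb{R} \setminus \mathbb{Q}. \end{cases} \]
The Dirichlet function is the usual example of a function which is not Riemann integrable
(on any real interval). We will now show that we can easily assign a numerical value to
its integral on any measurable subset of $\mathbb{R}$. First, note that a set $A \in \sigma(\mathcal{O}_{\mathbb{R}})$ is null with
respect to the Lebesgue measure if, and only if,
\[ \forall\, \varepsilon > 0 : \exists\, \{I_n\}_{n\in\mathbb{N}} : \ A \subseteq \bigcup_{n=0}^{\infty} I_n \ \text{ and } \ \sum_{n=0}^{\infty} \lambda(I_n) < \varepsilon, \]
where $\{I_n\}_{n\in\mathbb{N}}$ is a sequence of real intervals. From this, it immediately follows that any
countable subset of $\mathbb{R}$ has zero Lebesgue measure. Thus, $\lambda(\mathbb{Q}) = 0$ and hence, $f =_{\mathrm{a.e.}} 0$.
Therefore, by the previous lemmas, we have
\[ \int_A f \, d\lambda = 0 \]
for any measurable subset $A$ of $\mathbb{R}$.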
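The covering argument for countable sets can be illustrated numerically. In the sketch below, the rationals in $[0,1]$ are enumerated and the $n$-th one is covered by an interval of length $\varepsilon / 2^{n+2}$, so that the total length of all intervals is at most $\varepsilon/2 < \varepsilon$ no matter how many rationals are covered. The enumeration order, the restriction to $[0,1]$ and the helper names are assumptions made only for this illustration.

```python
from fractions import Fraction
from itertools import count, islice


def rationals_in_unit_interval():
    """Enumerate the rationals in [0, 1] without repetition."""
    yield Fraction(0)
    yield Fraction(1)
    for q in count(2):
        for p in range(1, q):
            r = Fraction(p, q)
            if r.denominator == q:      # skip fractions that are not in lowest terms
                yield r


def cover(eps, how_many):
    """Cover the first `how_many` rationals; the n-th interval has length eps / 2**(n+2)."""
    intervals = []
    for n, r in enumerate(islice(rationals_in_unit_interval(), how_many)):
        half = eps / 2 ** (n + 3)
        intervals.append((r - half, r + half))
    return intervals


eps = Fraction(1, 100)
intervals = cover(eps, 1000)
total_length = sum(b - a for a, b in intervals)
print(float(total_length), "<", float(eps))
```

Exact rational arithmetic is used so that the bound on the total length is not blurred by rounding.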


6.4 Lebesgue integrable functions 


Since the difference oo — oo is not defined, we cannot integrate all measurable functions. 
There is, however, a very small extra condition (beyond measurability) that determines the 
class of functions to which we can extend our previous definition. 


Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $f \colon M \to \mathbb{R}$. The function $f$ is said
to be (Lebesgue) integrable if it is measurable and
\[ \int_M |f| \, d\mu < \infty. \]
We denote the set of all integrable functions $M \to \mathbb{R}$ by $\mathcal{L}^1_{\mathbb{R}}(M, \Sigma, \mu)$, or simply $\mathcal{L}^1(M)$ if
there is no risk of confusion.


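For any $f \colon M \to \mathbb{R}$, we define $f^+ := \max(f, 0)$ and $f^- := \max(-f, 0)$, which are
measurable whenever $f$ is measurable by part (iv) of Lemma 6.11.

[Figure omitted: graphs of $f$, $f^+$ and $f^-$.]

Observe that $f = f^+ - f^-$ and $|f| = f^+ + f^-$. Clearly, we have $f^+ \leq |f|$ and $f^- \leq |f|$,
and hence
\[ \int_M |f| \, d\mu < \infty \ \Leftrightarrow\ \int_M f^+ \, d\mu < \infty \ \text{ and } \ \int_M f^- \, d\mu < \infty. \]


Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $f \colon M \to \mathbb{R}$ be integrable. Then, the
(Lebesgue) integral of $f$ over $M$ with respect to $\mu$ is
\[ \int_M f \, d\mu := \int_M f^+ \, d\mu - \int_M f^- \, d\mu. \]

It should be clear that the role of the integrability condition $\int_M |f| \, d\mu < \infty$ is to prevent
the integral of $f$ from being $\infty - \infty$, which is not defined.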
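The splitting into positive and negative parts is easy to check in the simplest setting, a finite set with the counting measure, where integrals are finite sums. The sketch below is only an illustration under that assumption; the sample function `f` is made up.

```python
# A real-valued function on the finite set {0, 1, 2, 3, 4}, stored as a dict.
f = {0: 3.0, 1: -1.5, 2: 0.0, 3: -2.0, 4: 4.5}

f_plus = {m: max(v, 0.0) for m, v in f.items()}     # f^+ = max(f, 0)
f_minus = {m: max(-v, 0.0) for m, v in f.items()}   # f^- = max(-f, 0)


def integral(g):
    """Integral of a non-negative function with respect to the counting measure."""
    return sum(g.values())


# Definition of the integral of f via its positive and negative parts.
integral_f = integral(f_plus) - integral(f_minus)
print(integral(f_plus), integral(f_minus), integral_f)
```

Here both $\int f^+ \, d\mu$ and $\int f^- \, d\mu$ are finite, so the difference is well defined and coincides with the plain sum of the values of $f$.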

In quantum mechanics, we usually deal with complex functions. The extension to the 
complex case of the integration theory presented thus far is straightforward. 


Definition. Let $(M, \Sigma, \mu)$ be a measure space. A complex function $f \colon M \to \mathbb{C}$ is said to
be integrable if the real functions $\operatorname{Re}(f)$ and $\operatorname{Im}(f)$ are measurable and
\[ \int_M |f| \, d\mu < \infty, \]
where $|f|$ denotes the complex modulus, i.e. $|f|^2 = \operatorname{Re}(f)^2 + \operatorname{Im}(f)^2$. We denote the set
of all integrable complex functions by $\mathcal{L}^1_{\mathbb{C}}(M, \Sigma, \mu)$, or simply $\mathcal{L}^1(M)$ if there is no risk of
confusion.


Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $f \colon M \to \mathbb{C}$ be integrable. We define
\[ \int_M f \, d\mu := \int_M \operatorname{Re}(f) \, d\mu + i \int_M \operatorname{Im}(f) \, d\mu. \]

The following lemma gives the properties expected of sums and scalar multiples of
integrals. Note, however, that before we show that, say, the integral of a sum is the sum of
the integrals, it is necessary to first show that the sum of two functions in $\mathcal{L}^1(M)$ is again
in $\mathcal{L}^1(M)$.


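Lemma 6.22. Let $(M, \Sigma, \mu)$ be a measure space, let $f, g \in \mathcal{L}^1(M)$ and let $c \in \mathbb{R}$. Then

(i) $|f| \in \mathcal{L}^1(M)$ and $\left| \int_M f \, d\mu \right| \leq \int_M |f| \, d\mu$

(ii) $cf \in \mathcal{L}^1(M)$ and $\int_M cf \, d\mu = c \int_M f \, d\mu$

(iii) $f + g \in \mathcal{L}^1(M)$ and $\int_M (f + g) \, d\mu = \int_M f \, d\mu + \int_M g \, d\mu$

(iv) $\mathcal{L}^1(M)$ is a vector space.


Proof. (i) As $||f|| = |f|$, we have $|f| \in \mathcal{L}^1(M)$. Then, by the triangle inequality,
\[ \left| \int_M f \, d\mu \right| = \left| \int_M f^+ \, d\mu - \int_M f^- \, d\mu \right| \leq \left| \int_M f^+ \, d\mu \right| + \left| \int_M f^- \, d\mu \right| = \int_M f^+ \, d\mu + \int_M f^- \, d\mu = \int_M (f^+ + f^-) \, d\mu = \int_M |f| \, d\mu. \]

(ii) We have
\[ \int_M |cf| \, d\mu = \int_M |c||f| \, d\mu = |c| \int_M |f| \, d\mu < \infty \]
and hence, we have $cf \in \mathcal{L}^1(M)$.
Suppose $c \geq 0$. Then, $(cf)^+ = cf^+$ and $(cf)^- = cf^-$ and thus
\[ \int_M cf \, d\mu = \int_M (cf)^+ \, d\mu - \int_M (cf)^- \, d\mu = \int_M cf^+ \, d\mu - \int_M cf^- \, d\mu = c \Big( \int_M f^+ \, d\mu - \int_M f^- \, d\mu \Big) = c \int_M f \, d\mu. \]
Now suppose $c = -1$. Then, $(-f)^+ = f^-$ and $(-f)^- = f^+$. Thus
\[ \int_M (-f) \, d\mu = \int_M f^- \, d\mu - \int_M f^+ \, d\mu = - \int_M f \, d\mu. \]
The case $c < 0$ follows by writing $c = (-1)(-c)$ and applying the above results.

(iii) By the triangle inequality, we have $|f + g| \leq |f| + |g|$ and thus
\[ \int_M |f + g| \, d\mu \leq \int_M |f| \, d\mu + \int_M |g| \, d\mu < \infty. \]
Hence, $f + g \in \mathcal{L}^1(M)$. Moreover, we have
\[ f + g = f^+ - f^- + g^+ - g^- = (f^+ + g^+) - (f^- + g^-), \]
where $f^+ + g^+$ and $f^- + g^-$ are non-negative and measurable. Note that, while
$(f + g)^+ \neq f^+ + g^+$ and $(f + g)^- \neq f^- + g^-$ in general, one can show that $f^+ + g^+$ and $f^- + g^-$
give an equivalent splitting of $f + g$ to define its integral. Therefore
\[ \int_M (f + g) \, d\mu = \int_M (f^+ + g^+) \, d\mu - \int_M (f^- + g^-) \, d\mu = \Big( \int_M f^+ \, d\mu - \int_M f^- \, d\mu \Big) + \Big( \int_M g^+ \, d\mu - \int_M g^- \, d\mu \Big) = \int_M f \, d\mu + \int_M g \, d\mu. \]

(iv) The set of all functions from $M$ to $\mathbb{R}$ is a vector space. By parts (ii) and (iii), we have
that $\mathcal{L}^1(M)$ is a vector subspace of this vector space and hence, a vector space in its
own right. $\square$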


Some properties of the integrals of non-negative, measurable functions easily carry over 
to general integrable functions. 


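Lemma 6.23. Let $(M, \Sigma, \mu)$ be a measure space and let $f, g \in \mathcal{L}^1(M)$.

(i) If $f =_{\mathrm{a.e.}} g$, then $\int_M f \, d\mu = \int_M g \, d\mu$

(ii) If $f \leq_{\mathrm{a.e.}} g$, then $\int_M f \, d\mu \leq \int_M g \, d\mu$.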


Just as the monotone convergence theorem was very important for integrals of non-negative,
measurable functions, there is a similar theorem that is important for integrals of
functions in $\mathcal{L}^1(M)$.


Theorem 6.24 (Dominated convergence theorem). Let $(M, \Sigma, \mu)$ be a measure space and
let $\{f_n\}_{n\in\mathbb{N}}$ be a sequence of measurable functions which converges almost everywhere to a
measurable function $f$. If there exists $g \in \mathcal{L}^1(M)$ such that $|f_n| \leq_{\mathrm{a.e.}} g$ for all $n \in \mathbb{N}$, then

(i) $f \in \mathcal{L}^1(M)$ and $f_n \in \mathcal{L}^1(M)$ for all $n \in \mathbb{N}$

(ii) $\lim_{n\to\infty} \int_M |f_n - f| \, d\mu = 0$

(iii) $\lim_{n\to\infty} \int_M f_n \, d\mu = \int_M f \, d\mu$.


Remark 6.25. By "$\{f_n\}_{n\in\mathbb{N}}$ converges almost everywhere to $f$" we mean, of course, that
there exists a null set $A \in \Sigma$ such that
\[ \forall\, m \in M \setminus A : \ \lim_{n\to\infty} f_n(m) = f(m). \]


6.5 The function spaces $L^p(M, \Sigma, \mu)$


UO eee B EE EHR TAAMSHS eRe P ASS BI ee ah eEP RD Mit 2p See woawnte 
a 


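Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $p \in [1, \infty)$. We define
\[ \mathcal{L}^p_{\mathbb{R}}(M, \Sigma, \mu) := \Big\{ f \colon M \to \mathbb{R} \ \Big|\ f \text{ is measurable and } \int_M |f|^p \, d\mu < \infty \Big\} \]
and, similarly,
\[ \mathcal{L}^p_{\mathbb{C}}(M, \Sigma, \mu) := \Big\{ f \colon M \to \mathbb{C} \ \Big|\ \operatorname{Re}(f) \text{ and } \operatorname{Im}(f) \text{ are measurable and } \int_M |f|^p \, d\mu < \infty \Big\}. \]

Whenever there is no risk of confusion, we lighten the notation to just $\mathcal{L}^p$.


Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $f \colon M \to \mathbb{R}$. The essential supremum
of $f$ is defined as
\[ \operatorname{ess\,sup} f := \inf\{c \in \mathbb{R} \mid f \leq_{\mathrm{a.e.}} c\}. \]
Then, $f$ is said to be almost everywhere bounded (from above) if $\operatorname{ess\,sup} f < \infty$.


Alternatively, $f \colon M \to \mathbb{R}$ is almost everywhere bounded if there exists a null set $A \in \Sigma$
such that $f$ restricted to $M \setminus A$ is bounded.


Definition. Let $(M, \Sigma, \mu)$ be a measure space. We define
\[ \mathcal{L}^\infty_{\mathbb{R}}(M, \Sigma, \mu) := \{ f \colon M \to \mathbb{R} \mid f \text{ is measurable and } \operatorname{ess\,sup} |f| < \infty \} \]
and, similarly,
\[ \mathcal{L}^\infty_{\mathbb{C}}(M, \Sigma, \mu) := \{ f \colon M \to \mathbb{C} \mid \operatorname{Re}(f) \text{ and } \operatorname{Im}(f) \text{ are measurable and } \operatorname{ess\,sup} |f| < \infty \}. \]

Whenever there is no risk of confusion, we lighten the notation to just $\mathcal{L}^p$, for $p \in [1, \infty]$.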


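All the $\mathcal{L}^p$ spaces become vector spaces once equipped with pointwise addition and
multiplication. Let us show this in detail for $\mathcal{L}^2_{\mathbb{C}}$.


Proposition 6.26. Let $(M, \Sigma, \mu)$ be a measure space. Then, $\mathcal{L}^2_{\mathbb{C}}$ is a complex vector space.


Proof. The set of all functions $M \to \mathbb{C}$, often denoted $\mathbb{C}^M$, is a vector space under pointwise
addition and multiplication. Hence, it suffices to show that $\mathcal{L}^2_{\mathbb{C}}$ is a subspace of $\mathbb{C}^M$.

(i) Let $f \in \mathcal{L}^2_{\mathbb{C}}$ and $z \in \mathbb{C}$. As $|z| \in \mathbb{R}$, we have
\[ \int_M |zf|^2 \, d\mu = |z|^2 \int_M |f|^2 \, d\mu < \infty \]
and hence, $zf \in \mathcal{L}^2_{\mathbb{C}}$.

(ii) Let $f, g \in \mathcal{L}^2_{\mathbb{C}}$. Note that
\[ |f + g|^2 = \overline{(f + g)}(f + g) = (\overline{f} + \overline{g})(f + g) = \overline{f} f + \overline{f} g + \overline{g} f + \overline{g} g = |f|^2 + \overline{f} g + \overline{g} f + |g|^2. \]
Moreover, as
\[ 0 \leq |f - g|^2 = |f|^2 - \overline{f} g - \overline{g} f + |g|^2, \]
we have $\overline{f} g + \overline{g} f \leq |f|^2 + |g|^2$, and thus
\[ |f + g|^2 \leq 2|f|^2 + 2|g|^2. \]
Therefore
\[ \int_M |f + g|^2 \, d\mu \leq 2 \int_M |f|^2 \, d\mu + 2 \int_M |g|^2 \, d\mu < \infty, \]
and hence $f + g \in \mathcal{L}^2_{\mathbb{C}}$. $\square$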


Ideally, we would like to turn all these $\mathcal{L}^p$ spaces into Banach spaces. Let us begin by
equipping them with a weaker piece of extra structure.


Proposition 6.27. Let $(M, \Sigma, \mu)$ be a measure space and let $p \in [1, \infty]$. Then, the maps
$\|\cdot\|_p \colon \mathcal{L}^p \to \mathbb{R}$ defined by
\[ \|f\|_p := \begin{cases} \Big( \int_M |f|^p \, d\mu \Big)^{1/p} & \text{for } 1 \leq p < \infty \\ \operatorname{ess\,sup} |f| & \text{for } p = \infty \end{cases} \]
are semi-norms on $\mathcal{L}^p$. That is, for all $z \in \mathbb{C}$ and $f, g \in \mathcal{L}^p$,
(i) $\|f\|_p \geq 0$
(ii) $\|zf\|_p = |z| \|f\|_p$
(iii) $\|f + g\|_p \leq \|f\|_p + \|g\|_p$.


In other words, the notion of semi-norm is a generalisation of that of norm obtained by
relaxing the definiteness condition. If the measure space $(M, \Sigma, \mu)$ is such that the empty
set is the only null set, then $\|\cdot\|_p$ is automatically definite and hence, a norm.


Example 6.28. Consider $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$, where $\mu$ is the counting measure. Then, as $\mu(A)$ is
the cardinality of $A$, the only null set is the empty set. Thus, recalling that functions on $\mathbb{N}$
are just sequences, the maps
\[ \|\{a_n\}_{n\in\mathbb{N}}\|_p := \begin{cases} \Big( \sum_{n=0}^{\infty} |a_n|^p \Big)^{1/p} & \text{for } 1 \leq p < \infty \\ \sup\{|a_n| \mid n \in \mathbb{N}\} & \text{for } p = \infty \end{cases} \]
are norms on $\mathcal{L}^p(\mathbb{N})$. In particular, note that we have $\mathcal{L}^2(\mathbb{N}) = \ell^2(\mathbb{N})$.


However, in general measure spaces, we only have
\[ \|f\|_p = 0 \ \Leftrightarrow\ f =_{\mathrm{a.e.}} 0, \]
as we have shown in Theorem 6.19 for $\mathcal{L}^1$, and it is often very easy to produce an $f \neq 0$
such that $\|f\|_p = 0$. The solution to this problem is to construct new spaces from the $\mathcal{L}^p$ in
which functions that are almost everywhere equal are, in fact, the same function. In other
words, we need to consider the quotient space of $\mathcal{L}^p$ by the equivalence relation "being
almost everywhere equal".


Definition. Let $M$ be a set. An equivalence relation on $M$ is a set $\sim\ \subseteq M \times M$ such that,
writing $a \sim b$ for $(a, b) \in\ \sim$, we have

(i) $a \sim a$ (reflexivity)
(ii) $a \sim b \Leftrightarrow b \sim a$ (symmetry)
(iii) ($a \sim b$ and $b \sim c$) $\Rightarrow a \sim c$ (transitivity)

for all $a, b, c \in M$. If $\sim$ is an equivalence relation on $M$, we define the equivalence class of
$m \in M$ by $[m] := \{a \in M \mid m \sim a\}$ and the quotient set of $M$ by $\sim$ by
\[ M/{\sim} := \{[m] \mid m \in M\}. \]
It is easy to show that $M/{\sim}$ is a partition of $M$, i.e.
\[ M = \bigcup_{m \in M} [m] \quad \text{and} \quad [a] \cap [b] = \emptyset \ \text{ whenever } a \not\sim b. \]

In fact, the notions of equivalence relation on $M$ and partition of $M$ are one and the same.
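For a finite set, the quotient set can be built explicitly by sorting each element into the class of a representative it is related to. The sketch below assumes that the supplied relation really is an equivalence relation (this is not checked); the function name `quotient` and the example relation, congruence modulo 3, are illustrative choices.

```python
def quotient(M, related):
    """Return the list of equivalence classes of the finite set M under `related`."""
    classes = []
    for m in M:
        for cls in classes:
            representative = next(iter(cls))
            if related(m, representative):
                cls.add(m)              # m belongs to an existing class
                break
        else:
            classes.append({m})         # m starts a new class
    return classes


M = range(10)
classes = quotient(M, lambda a, b: (a - b) % 3 == 0)
print(classes)
```

The resulting classes are pairwise disjoint and their union is all of $M$, which is the partition property stated above.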


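Lemma 6.29. Let $(M, \Sigma, \mu)$ be a measure space and let $\sim$ be defined by
\[ f \sim g \ :\Leftrightarrow\ f =_{\mathrm{a.e.}} g. \]
Then, $\sim$ is an equivalence relation on $\mathcal{L}^p$.


Proof. Let $f, g, h \in \mathcal{L}^p$. Clearly, $f \sim f$ and $f \sim g \Leftrightarrow g \sim f$. Now suppose that $f \sim g$ and
$g \sim h$. Then, there exist null sets $A, B \in \Sigma$ such that $f = g$ on $M \setminus A$ and $g = h$ on $M \setminus B$.
Recall that $\sigma$-algebras are closed under unions and hence, $A \cup B \in \Sigma$. Obviously, we
have $f = h$ on $M \setminus (A \cup B)$ and, since
\[ \mu(A \cup B) \leq \mu(A) + \mu(B) = 0, \]
the set $A \cup B$ is null. Thus, $f \sim h$. $\square$


Definition. Let $(M, \Sigma, \mu)$ be a measure space and let $f \sim g :\Leftrightarrow f =_{\mathrm{a.e.}} g$. We define
\[ L^p := \mathcal{L}^p/{\sim} = \{[f] \mid f \in \mathcal{L}^p\}. \]


Lemma 6.30. Let $(M, \Sigma, \mu)$ be a measure space. Then, the maps
(i) $+ \colon L^p \times L^p \to L^p$ defined by $[f] + [g] := [f + g]$
(ii) $\cdot \colon \mathbb{C} \times L^p \to L^p$ defined by $z[f] := [zf]$
(iii) $\|\cdot\|_p \colon L^p \to \mathbb{R}$ defined by $\|[f]\|_p := \|f\|_p$
are well-defined. Moreover, $\|\cdot\|_p \colon L^p \to \mathbb{R}$ is a norm on $L^p$.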


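Lemma 6.31 (Hölder's inequality). Let $(M, \Sigma, \mu)$ be a measure space and let $p,q \in [1, \infty]$
be such that $\frac{1}{p} + \frac{1}{q} = 1$ (where $\frac{1}{\infty} := 0$). Then, for all measurable functions $f,g\colon M \to \mathbb{C}$,
we have
\[
  \int_M |fg| \, d\mu \le \left(\int_M |f|^p \, d\mu\right)^{1/p} \left(\int_M |g|^q \, d\mu\right)^{1/q}
\]
(with the corresponding factor read as the essential supremum when $p$ or $q$ is $\infty$).
Hence, if $[f] \in L^p$ and $[g] \in L^q$, with $\frac{1}{p} + \frac{1}{q} = 1$, then $[fg] \in L^1$ and
\[
  \|[fg]\|_1 \le \|[f]\|_p \, \|[g]\|_q.
\]
The equality holds if and only if $|f|^p$ and $|g|^q$ are linearly dependent in $L^1$. That is,
there exist non-negative real numbers $\alpha, \beta \in \mathbb{R}$, not both zero, such that
\[
  \alpha |f|^p =_{\mathrm{a.e.}} \beta |g|^q.
\]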
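Hölder's inequality can be spot-checked for the counting measure on a finite set, where integrals become finite sums. The sketch below is only a sanity check under that assumption; the names `lp_norm` and `holder_gap` are hypothetical helpers, not notation from the lecture.

```python
import math

def lp_norm(seq, p):
    """p-norm of a finite sequence (counting measure on a finite set)."""
    if p == math.inf:
        return max(abs(a) for a in seq)
    return sum(abs(a) ** p for a in seq) ** (1.0 / p)

def holder_gap(f, g, p):
    """Return ||f||_p * ||g||_q - ||fg||_1, where 1/p + 1/q = 1."""
    if p == 1:
        q = math.inf
    elif p == math.inf:
        q = 1
    else:
        q = p / (p - 1)
    fg = [x * y for x, y in zip(f, g)]
    return lp_norm(f, p) * lp_norm(g, q) - lp_norm(fg, 1)

f = [1.0, -2.0, 3.0]
g = [2.0, 0.5, -1.0]
for p in (1, 1.5, 2, 4, math.inf):
    print(p, holder_gap(f, g, p) >= -1e-12)   # the gap is non-negative

# Equality case for p = q = 2: |g|^2 proportional to |f|^2.
print(abs(holder_gap(f, [2 * x for x in f], 2)) < 1e-9)
```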


Theorem 6.32. The spaces $L^p$ are Banach spaces for all $p \in [1, \infty]$.


We have already remarked that the case $p = 2$ is special in that $L^2$ is the only $L^p$ space
which can be made into a Hilbert space.


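Proposition 6.33. Let $(M, \Sigma, \mu)$ be a measure space and define
\[
  \langle\cdot|\cdot\rangle_{L^2}\colon L^2 \times L^2 \to \mathbb{C}, \qquad
  ([f],[g]) \mapsto \int_M \overline{f} g \, d\mu.
\]
Then, $\langle\cdot|\cdot\rangle_{L^2}$ is well-defined and it is a sesqui-linear inner product on $L^2$.


Proof. First note that if $[f] \in L^2$, then $[\overline{f}] \in L^2$ and hence, by Hölder's inequality, $[\overline{f}g] \in L^1$.
This ensures that $\langle [f]|[g]\rangle_{L^2} \in \mathbb{C}$ for all $[f], [g] \in L^2$.
To show well-definedness, let $f' =_{\mathrm{a.e.}} f$ and $g' =_{\mathrm{a.e.}} g$. Then, $\overline{f'}g' =_{\mathrm{a.e.}} \overline{f}g$ and thus
\[
  \langle [f']|[g']\rangle_{L^2} = \int_M \overline{f'} g' \, d\mu = \int_M \overline{f} g \, d\mu = \langle [f]|[g]\rangle_{L^2}.
\]
Now, let $[f], [g], [h] \in L^2$ and $z \in \mathbb{C}$. Then

(i) We have
\[
  \langle [f]|[g]\rangle_{L^2} = \int_M \overline{f} g \, d\mu = \overline{\int_M \overline{g} f \, d\mu} = \overline{\langle [g]|[f]\rangle_{L^2}}.
\]

(ii) We have
\[
  \langle [f]|z[g] + [h]\rangle_{L^2} = \int_M \overline{f}(zg + h) \, d\mu
  = z \int_M \overline{f} g \, d\mu + \int_M \overline{f} h \, d\mu
  = z\langle [f]|[g]\rangle_{L^2} + \langle [f]|[h]\rangle_{L^2}.
\]

(iii) We have $\langle [f]|[f]\rangle_{L^2} = \int_M \overline{f} f \, d\mu = \int_M |f|^2 \, d\mu \ge 0$ and
\[
  \langle [f]|[f]\rangle_{L^2} = 0 \iff \int_M |f|^2 \, d\mu = 0 \iff \|[f]\|_2 = 0.
\]
Thus, $[f] = 0 := [0]$. □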


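The last part of the proof also shows that $\langle\cdot|\cdot\rangle_{L^2}$ induces the norm $\|\cdot\|_2$, with respect
to which $L^2$ is a Banach space. Hence, $(L^2, \langle\cdot|\cdot\rangle_{L^2})$ is a Hilbert space.


Remark 6.34. The inner product $\langle\cdot|\cdot\rangle_{L^2}$ on $L^2(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$, with $\mu$ the counting measure, coincides with the inner product
$\langle\cdot|\cdot\rangle_{\ell^2}$ on $\ell^2(\mathbb{N})$ defined in the section on separable Hilbert spaces.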


7 Self-adjoint and essentially self-adjoint operators 


While we have already given some of the following definitions in the introductory section 
on the axioms of quantum mechanics, we reproduce them here for completeness. 


7.1 Adjoint operators 


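Definition. A linear map or operator $A\colon D_A \to \mathcal{H}$ is said to be densely defined if $D_A$ is
dense in $\mathcal{H}$, i.e.
\[
  \forall\, \varepsilon > 0 : \forall\, \psi \in \mathcal{H} : \exists\, \alpha \in D_A : \|\alpha - \psi\| < \varepsilon.
\]
Equivalently, $\overline{D_A} = \mathcal{H}$, i.e. for every $\psi \in \mathcal{H}$ there exists a sequence $\{\alpha_n\}_{n\in\mathbb{N}}$ in $D_A$
whose limit is $\psi$.


Definition. Let $A\colon D_A \to \mathcal{H}$ be a densely defined operator on $\mathcal{H}$. The adjoint of $A$ is the
operator $A^*\colon D_{A^*} \to \mathcal{H}$ defined by

(i) $D_{A^*} := \{\psi \in \mathcal{H} \mid \exists\, \eta \in \mathcal{H} : \forall\, \alpha \in D_A : \langle\psi|A\alpha\rangle = \langle\eta|\alpha\rangle\}$
(ii) $A^*\psi := \eta$.

Proposition 7.1. The adjoint operator $A^*\colon D_{A^*} \to \mathcal{H}$ is well-defined.

Proof. Let $\psi \in \mathcal{H}$ and let $\eta, \tilde{\eta} \in \mathcal{H}$ be such that
\[
  \forall\, \alpha \in D_A : \quad \langle\psi|A\alpha\rangle = \langle\eta|\alpha\rangle \quad\text{and}\quad \langle\psi|A\alpha\rangle = \langle\tilde{\eta}|\alpha\rangle.
\]
Then, for all $\alpha$ in $D_A$, we have
\[
  \langle\eta - \tilde{\eta}|\alpha\rangle = \langle\eta|\alpha\rangle - \langle\tilde{\eta}|\alpha\rangle = \langle\psi|A\alpha\rangle - \langle\psi|A\alpha\rangle = 0.
\]
Since $D_A$ is dense in $\mathcal{H}$ and the inner product is continuous, this extends to $\langle\eta - \tilde{\eta}|\eta - \tilde{\eta}\rangle = 0$
and hence, by positive-definiteness, $\eta = \tilde{\eta}$. □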


If $A$ and $B$ are densely defined and $D_A = D_B$, then the pointwise sum $A + B$ is clearly
densely defined and hence, $(A + B)^*$ exists. However, we do not have $(A + B)^* = A^* + B^*$
in general, unless one of $A$ and $B$ is bounded, but we do have the following result.


Proposition 7.2. If $A$ is densely defined, then
\[
  (A + z\,\mathrm{id}_{D_A})^* = A^* + \overline{z}\,\mathrm{id}_{D_{A^*}}
\]
for any $z \in \mathbb{C}$.


The identity operator is usually suppressed in the notation, so that the above
equation reads $(A + z)^* = A^* + \overline{z}$.
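In finite dimensions every operator is bounded and defined on all of $\mathbb{C}^n$, so the adjoint reduces to the conjugate transpose. The following sketch (with arbitrarily chosen matrix and vectors, and hypothetical helper names) checks the defining relation $\langle\psi|A\alpha\rangle = \langle A^*\psi|\alpha\rangle$ and the identity of Proposition 7.2 in that simplified setting; it says nothing about the domain subtleties that arise for unbounded operators.

```python
def inner(u, v):
    """Inner product on C^n, antilinear in the first slot as in the notes."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def apply(A, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def shift(A, z):
    """Return A + z * identity."""
    n = len(A)
    return [[A[i][j] + (z if i == j else 0) for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

A = [[1 + 2j, 3j], [2 + 0j, -1j]]
psi = [1 - 1j, 2 + 0.5j]
alpha = [0.5j, -3 + 1j]
z = 2 - 3j

# Defining relation of the adjoint: <psi|A alpha> = <A* psi|alpha>.
print(abs(inner(psi, apply(A, alpha)) - inner(apply(adjoint(A), psi), alpha)) < 1e-12)
# Proposition 7.2: (A + z)* = A* + conj(z).
print(close(adjoint(shift(A, z)), shift(adjoint(A), z.conjugate())))
```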


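Definition. Let $A\colon D_A \to \mathcal{H}$ be a linear operator. The kernel and range of $A$ are
\[
  \ker(A) := \{\alpha \in D_A \mid A\alpha = 0\}, \qquad \operatorname{ran}(A) := \{A\alpha \mid \alpha \in D_A\}.
\]
The range is also called image and $\operatorname{im}(A)$ and $A(D_A)$ are alternative notations.


Proposition 7.3. An operator $A\colon D_A \to \mathcal{H}$ is
(i) injective if, and only if, $\ker(A) = \{0\}$
(ii) surjective if, and only if, $\operatorname{ran}(A) = \mathcal{H}$.


Definition. An operator $A\colon D_A \to \mathcal{H}$ is invertible if there exists an operator $B\colon \mathcal{H} \to D_A$
such that $A \circ B = \mathrm{id}_{\mathcal{H}}$ and $B \circ A = \mathrm{id}_{D_A}$.


Proposition 7.4. An operator $A$ is invertible if, and only if,
\[
  \ker(A) = \{0\} \quad\text{and}\quad \operatorname{ran}(A) = \mathcal{H}.
\]

Proposition 7.5. Let $A$ be densely defined. Then, $\ker(A^*) = \operatorname{ran}(A)^\perp$.

Proof. We have
\[
  \psi \in \ker(A^*) \iff A^*\psi = 0 \iff \forall\, \alpha \in D_A : \langle\psi|A\alpha\rangle = 0 \iff \psi \in \operatorname{ran}(A)^\perp. \qquad □
\]

Definition. Let $A\colon D_A \to \mathcal{H}$ and $B\colon D_B \to \mathcal{H}$ be operators. We say that $B$ is an extension
of $A$, and we write $A \subseteq B$, if
(i) $D_A \subseteq D_B$
(ii) $\forall\, \alpha \in D_A : A\alpha = B\alpha$.

Proposition 7.6. Let $A$, $B$ be densely defined. If $A \subseteq B$, then $B^* \subseteq A^*$.

Proof. (i) Let $\psi \in D_{B^*}$. Then, there exists $\eta \in \mathcal{H}$ such that
\[
  \forall\, \beta \in D_B : \langle\psi|B\beta\rangle = \langle\eta|\beta\rangle.
\]
In particular, as $A \subseteq B$, we have $D_A \subseteq D_B$ and thus
\[
  \forall\, \alpha \in D_A \subseteq D_B : \langle\psi|B\alpha\rangle = \langle\psi|A\alpha\rangle = \langle\eta|\alpha\rangle.
\]
Therefore, $\psi \in D_{A^*}$ and hence, $D_{B^*} \subseteq D_{A^*}$.
(ii) From the above, we also have $B^*\psi := \eta =: A^*\psi$ for all $\psi \in D_{B^*}$. □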


7.2 The adjoint of a symmetric operator 


Definition. A densely defined operator $A\colon D_A \to \mathcal{H}$ is called symmetric if
\[
  \forall\, \alpha, \beta \in D_A : \langle\alpha|A\beta\rangle = \langle A\alpha|\beta\rangle.
\]


Remark 7.7. In the physics literature, symmetric operators are usually referred to as Hermitian
operators. However, this notion is then confused with that of self-adjointness
when physicists say that observables in quantum mechanics correspond to Hermitian operators,
which is not the case. On the other hand, if one decides to use Hermitian as a
synonym of self-adjoint, it is then not true that all symmetric operators are Hermitian. In
order to prevent confusion, we will avoid the term Hermitian altogether.


Proposition 7.8. If $A$ is symmetric, then $A \subseteq A^*$.


Proof. Let $\psi \in D_A$ and let $\eta := A\psi$. Then, by symmetry, we have
\[
  \forall\, \alpha \in D_A : \langle\psi|A\alpha\rangle = \langle\eta|\alpha\rangle
\]
and hence $\psi \in D_{A^*}$. Therefore, $D_A \subseteq D_{A^*}$ and $A^*\psi := \eta = A\psi$. □

Definition. A densely defined operator $A\colon D_A \to \mathcal{H}$ is self-adjoint if $A = A^*$. That is,
(i) $D_A = D_{A^*}$
(ii) $\forall\, \alpha \in D_A : A\alpha = A^*\alpha$.


Remark 7.9. Observe that any self-adjoint operator is also symmetric, but a symmetric 
operator need not be self-adjoint. 


Corollary 7.10. A self-adjoint operator is maximal with respect to self-adjoint extension. 


Proof. Let $A$, $B$ be self-adjoint and suppose that $A \subseteq B$. Then
\[
  A \subseteq B = B^* \subseteq A^* = A
\]
and hence, $B = A$. □


In fact, self-adjoint operators are maximal even with respect to symmetric extension,
for we would have $B \subseteq B^*$ instead of $B = B^*$ in the above equation.


7.3 Closability, closure, closedness 


Definition. (i) A densely defined operator $A$ is called closable if its adjoint $A^*$ is also
densely defined.


(ii) The closure of a closable operator is $\overline{A} := A^{**} = (A^*)^*$.


(iii) An operator is called closed if $A = \overline{A}$.


Remark 7.11. Note that we have used the overline notation in several contexts with different
meanings. When applied to complex numbers, it denotes complex conjugation. When
applied to subsets of a topological space, it denotes their topological closure. Finally, when
applied to (closable) operators, it denotes their closure as defined above.


Proposition 7.12. A symmetric operator is necessarily closable.


Proof. Let $A$ be symmetric. Then, $A \subseteq A^*$ and hence, $D_A \subseteq D_{A^*}$. Since symmetric
operators are densely defined, we have
\[
  \mathcal{H} = \overline{D_A} \subseteq \overline{D_{A^*}} \subseteq \mathcal{H}.
\]
Hence, $\overline{D_{A^*}} = \mathcal{H}$. □


Note carefully that the adjoint of a symmetric operator need not be symmetric. In
particular, we cannot conclude that $A^* \subseteq A^{**}$. In fact, the reversed inclusion holds.


Proposition 7.13. If $A$ is symmetric, then $A^{**} \subseteq A^*$.
Proof. Since $A$ is symmetric, we have $A \subseteq A^*$. Hence, $A^{**} \subseteq A^*$ by Proposition 7.6. □


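Lemma 7.14. For any closable operator $A$, we have $A \subseteq A^{**}$.
Proof. Recall that $D_{A^*} := \{\psi \in \mathcal{H} \mid \exists\, \eta \in \mathcal{H} : \forall\, \alpha \in D_A : \langle\psi|A\alpha\rangle = \langle\eta|\alpha\rangle\}$. Then
\[
  \forall\, \psi \in D_{A^*} : \forall\, \alpha \in D_A : \langle\psi|A\alpha\rangle = \langle A^*\psi|\alpha\rangle.
\]
Since $\psi$ and $\alpha$ above are “dummy variables”, and the order of quantifiers of the same type
is immaterial, we have
\[
  \forall\, \psi \in D_A : \forall\, \alpha \in D_{A^*} : \langle\alpha|A\psi\rangle = \langle A^*\alpha|\psi\rangle.
\]
Then, taking the conjugate on both inner product terms and moving them across the equal
sign, we have
\[
  \forall\, \psi \in D_A : \forall\, \alpha \in D_{A^*} : \langle\psi|A^*\alpha\rangle = \langle A\psi|\alpha\rangle.
\]
Now, letting $\eta := A\psi \in \mathcal{H}$ we see that $\psi \in D_{A^{**}}$, and so $D_A \subseteq D_{A^{**}}$. Moreover, by
definition, $A^{**}\psi := \eta = A\psi$ for all $\psi \in D_A$, and thus $A \subseteq A^{**}$. □


Corollary 7.15. If $A$ is symmetric, then $A \subseteq \overline{A} \subseteq A^*$.
Corollary 7.16. If $A$ is symmetric, then $\overline{A}$ is symmetric.
Theorem 7.17. Let $A$ be a densely defined operator.
(i) The operator $A^*$ is closed
(ii) If $A$ is invertible, we have $(A^{-1})^* = (A^*)^{-1}$
(iii) If $A$ is invertible and closable and $\overline{A}$ is injective, then $\overline{A^{-1}} = \overline{A}^{\,-1}$.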


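These theorems are proved using a formulation in terms of operator graphs which has not been presented
here, so in order to save space not all of the proofs are given. However, as an example of the
technique, we shall provide the proof for (i).¹²


Proof. Let $\mathcal{H}$ and $\mathcal{K}$ be two Hilbert spaces and let $A\colon \mathcal{H} \to \mathcal{K}$ be densely defined. We
define the graph of $A$ as
\[
  \Gamma(A) := \{h \oplus Ah \mid h \in D_A\} \subseteq \mathcal{H} \oplus \mathcal{K}.
\]
In this context, we say $A$ is closed if and only if $\Gamma(A)$ is closed w.r.t. the product topology
on $\mathcal{H} \oplus \mathcal{K}$. Next define the operator
\[
  J\colon \mathcal{H} \oplus \mathcal{K} \to \mathcal{K} \oplus \mathcal{H}, \qquad h \oplus k \mapsto (-k) \oplus h.
\]
We now wish to show that $\Gamma(A^*) = [J(\Gamma(A))]^\perp$. Since orthogonal complements are always
closed, the right-hand side is closed, which gives us that $A^*$ is closed.


¹² Many thanks to Alfredo Sepulveda-Ximenez for providing a brilliant answer to this on Quora.


($\subseteq$) Let $x \in D_A$ and $y \in D_{A^*}$. We have $y \oplus A^*(y) \in \Gamma(A^*)$, and
\[
  \langle y \oplus A^*(y), J(x \oplus A(x))\rangle = \langle y \oplus A^*(y), -A(x) \oplus x\rangle
  = -\langle y, A(x)\rangle + \langle A^*(y), x\rangle = 0,
\]
and so $\Gamma(A^*) \subseteq [J(\Gamma(A))]^\perp$.

($\supseteq$) Let $y \oplus z \in [J(\Gamma(A))]^\perp$. Then, for all $x \in D_A$,
\[
  \langle y \oplus z, -A(x) \oplus x\rangle = 0, \quad\text{i.e.}\quad \langle y, A(x)\rangle = \langle z, x\rangle,
\]
which, from the definition of the adjoint, tells us that $y \in D_{A^*}$ and $z = A^*(y)$ and so
$y \oplus z \in \Gamma(A^*)$ and $[J(\Gamma(A))]^\perp \subseteq \Gamma(A^*)$. □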


7.4 Essentially self-adjoint operators 


Usually, checking that an operator is symmetric is easy. By contrast, checking that an 
operator is self-adjoint (directly from the definition) requires the construction of the adjoint, 
which is not always easy. However, since every self-adjoint operator is symmetric, we 
can first check the symmetry property, and then determine criteria for when a symmetric 
operator is self-adjoint, or for when a self-adjoint extension exists. 

Two complications with the extension approach are that, given a symmetric operator,
there could be no self-adjoint extension at all, or there may be several different self-adjoint 
extensions. There is, however, a class of operators for which the situation is much nicer. 


Definition. A symmetric operator $A$ is called essentially self-adjoint if $\overline{A}$ is self-adjoint.
This is weaker than the self-adjointness condition.
Proposition 7.18. If $A$ is self-adjoint, then it is essentially self-adjoint.


Proof. If $A = A^*$, then $A \subseteq A^*$ and $A^* \subseteq A$. Hence, by Proposition 7.6, $A^{**} \subseteq A^*$ and $A^* \subseteq A^{**}$, so $A^* = A^{**}$.
Similarly, we have $A^{**} = A^{***}$, which is just $\overline{A} = \overline{A}^{\,*}$. □


Theorem 7.19. If $A$ is essentially self-adjoint, then there exists a unique self-adjoint extension
of $A$, namely $\overline{A}$.


Proof. (i) Since $A$ is symmetric, it is closable and hence $\overline{A}$ exists.
(ii) By Lemma 7.14, we have $A \subseteq \overline{A}$, so $\overline{A}$ is an extension of $A$.


(iii) Suppose that $B$ is another self-adjoint extension of $A$. Then, $A \subseteq B = B^*$, hence
$B^* \subseteq A^*$ by Proposition 7.6, and thus $A^{**} \subseteq B^{**} = B$. This means that $\overline{A} \subseteq B$, i.e. $B$ is a self-adjoint
extension of the self-adjoint operator $\overline{A}$. Hence, $B = \overline{A}$ by Corollary 7.10. □


Remark 7.20. One may get the feeling at this point that checking for essential self-adjointness
of an operator $A$, i.e. checking that $A^{**} = A^{***}$, is hardly easier than checking whether $A$
is self-adjoint, that is, whether $A = A^*$. However, this is not so. While we will show below
that there is a sufficient criterion for self-adjointness which does not require calculating
the adjoint, we will see that there is, in fact, a necessary and sufficient criterion to check
for essential self-adjointness of an operator without calculating a single adjoint.


Remark 7.21. If a symmetric operator $A$ fails to even be essentially self-adjoint, then there
is either no self-adjoint extension of $A$ or there are several.


Definition. Let $A$ be a densely defined operator. The defect indices of $A$ are
\[
  d_+ := \dim(\ker(A^* - i)), \qquad d_- := \dim(\ker(A^* + i)),
\]
where by $A^* \pm i$ we mean, of course, $A^* \pm i \cdot \mathrm{id}_{D_{A^*}}$.


Theorem 7.22. A symmetric operator has a self-adjoint extension if its defect indices
coincide. Otherwise, there exists no self-adjoint extension.


Remark 7.23. We will later see that if $d_+ = d_- = 0$, then $A$ is essentially self-adjoint.
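As a hedged illustration of how defect indices are computed in practice, consider the operator $P = -i\,\frac{d}{dx}$ on functions on $[0, 2\pi]$ vanishing at both end points, which is studied in the case study on the momentum operator. The sketch below assumes (without proof here) that $P^*$ acts as $-i\,\frac{d}{dx}$ on a domain without boundary conditions, so that the kernels can be found by solving first-order ODEs.

```latex
\begin{align*}
  \ker(P^* - i):\quad & -i\psi' = i\psi \;\Longrightarrow\; \psi(x) = c\,e^{-x},\\
  \ker(P^* + i):\quad & -i\psi' = -i\psi \;\Longrightarrow\; \psi(x) = c\,e^{x},\\
  & e^{-x},\, e^{x} \in L^2([0,2\pi]) \;\Longrightarrow\; d_+ = d_- = 1 .
\end{align*}
```

Under this assumption the defect indices coincide but are non-zero, so Theorem 7.22 suggests that self-adjoint extensions exist, while Remark 7.23 does not apply. By contrast, on the whole real line neither $e^{-x}$ nor $e^{x}$ is square-integrable, which would give $d_+ = d_- = 0$.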


7.5 Criteria for self-adjointness and essential self-adjointness 


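Theorem 7.24. A symmetric operator $A$ is self-adjoint if (but not only if)
\[
  \exists\, z \in \mathbb{C} : \quad \operatorname{ran}(A + z) = \mathcal{H} = \operatorname{ran}(A + \overline{z}).
\]


Proof. Since $A$ is symmetric, by Proposition 7.8, we have $A \subseteq A^*$. Hence, it remains to be
shown that $A^* \subseteq A$. To that end, let $\psi \in D_{A^*}$ and let $z \in \mathbb{C}$. Clearly,
\[
  A^*\psi + \overline{z}\psi \in \mathcal{H}.
\]
Now suppose that $z$ satisfies the hypothesis of the theorem. Then, as $\operatorname{ran}(A + \overline{z}) = \mathcal{H}$,
\[
  \exists\, \alpha \in D_A : \quad A^*\psi + \overline{z}\psi = (A + \overline{z})\alpha.
\]
By using the symmetry of $A$, we have that, for any $\beta \in D_A$,
\[
  \langle\psi|(A + z)\beta\rangle = \langle A^*\psi + \overline{z}\psi|\beta\rangle = \langle (A + \overline{z})\alpha|\beta\rangle = \langle\alpha|(A + z)\beta\rangle.
\]
Since $\beta \in D_A$ was arbitrary and $\operatorname{ran}(A + z) = \mathcal{H}$, we have
\[
  \forall\, \varphi \in \mathcal{H} : \langle\psi|\varphi\rangle = \langle\alpha|\varphi\rangle.
\]
Hence, by positive-definiteness of the inner product, we have $\psi = \alpha$, thus $\psi \in D_A$ and
$A^*\psi = (A + \overline{z})\alpha - \overline{z}\psi = A\psi$. Therefore, $A^* \subseteq A$. □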


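Theorem 7.25. A symmetric operator $A$ is essentially self-adjoint if, and only if,
\[
  \exists\, z \in \mathbb{C} \setminus \mathbb{R} : \quad \overline{\operatorname{ran}(A + z)} = \mathcal{H} = \overline{\operatorname{ran}(A + \overline{z})}.
\]


The following criterion for essential self-adjointness, which does require the calculation
of $A^*$, is equivalent to the previous result and, in some situations, it can be easier to check.


Theorem 7.26. A symmetric operator $A$ is essentially self-adjoint if, and only if,
\[
  \exists\, z \in \mathbb{C} \setminus \mathbb{R} : \quad \ker(A^* + z) = \{0\} = \ker(A^* + \overline{z}).
\]


Proof. We show that this is equivalent to the previous condition. Recall that if $M$ is a
linear subspace of $\mathcal{H}$, then $M^\perp$ is closed and hence, $M^{\perp\perp} = \overline{M}$ (Proposition 4.7). Thus,
by Proposition 7.5, we have
\[
  \overline{\operatorname{ran}(A + z)} = \operatorname{ran}(A + z)^{\perp\perp} = \ker(A^* + \overline{z})^\perp
\]
and, similarly,
\[
  \overline{\operatorname{ran}(A + \overline{z})} = \ker(A^* + z)^\perp.
\]
Since $\mathcal{H}^\perp = \{0\}$ and $\{0\}^\perp = \mathcal{H}$, the above condition is equivalent to
\[
  \overline{\operatorname{ran}(A + z)} = \mathcal{H} = \overline{\operatorname{ran}(A + \overline{z})}. \qquad □
\]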


8 Spectra and perturbation theory 


We will now focus on the spectra of operators and on the decomposition of the spectra of 
self-adjoint operators. The significance of spectra is that the axioms of quantum mechanics 
prescribe that the possible measurement values of an observable (which is, in particular, a 
self-adjoint operator) are those in the so-called spectrum of the operator. 


A common task in almost any quantum mechanical problem that you might wish to
solve is to determine the spectrum of some observable. This is usually the Hamiltonian,
or energy operator, since the time evolution of a quantum system is governed by the
exponential of the Hamiltonian, which is more practically determined by first determining its
spectrum.

More often than not, it is not possible to determine the spectrum of an operator exactly 
(i.e. analytically). One then resorts to perturbation theory which consists in expressing the 
operator whose spectrum we want to determine as the sum of an operator whose spectrum 
can be determined analytically and another whose contribution is “small” in some sense to 
be made precise. 
8.1 Resolvent map and spectrum 
Definition. The resolvent map of an operator $A$ is the map
\[
  R_A\colon \rho(A) \to \mathcal{L}(\mathcal{H}), \qquad z \mapsto (A - z)^{-1},
\]
where $\mathcal{L}(\mathcal{H}) := \mathcal{L}(\mathcal{H}, \mathcal{H})$ and $\rho(A)$ is the resolvent set of $A$, defined as
\[
  \rho(A) := \{z \in \mathbb{C} \mid (A - z)^{-1} \in \mathcal{L}(\mathcal{H})\}.
\]

Remark 8.1. Checking whether a complex number $z$ belongs to $\rho(A)$ may seem like a daunting
task and, in general, it is. However, we will almost exclusively be interested in closed
operators, and the closed graph theorem states that if $A$ is closed, then $(A - z)^{-1} \in \mathcal{L}(\mathcal{H})$
if, and only if, $A - z$ is bijective.


Definition. The spectrum of an operator $A$ is $\sigma(A) := \mathbb{C} \setminus \rho(A)$.
Definition. A complex number $\lambda \in \mathbb{C}$ is said to be an eigenvalue of $A\colon D_A \to \mathcal{H}$ if
\[
  \exists\, \psi \in D_A \setminus \{0\} : A\psi = \lambda\psi.
\]
Such an element $\psi$ is called an eigenvector of $A$ associated to the eigenvalue $\lambda$.
Corollary 8.2. Let $\lambda \in \mathbb{C}$ be an eigenvalue of $A$. Then, $\lambda \in \sigma(A)$.
Proof. If $\lambda$ is an eigenvalue of $A$, then there exists $\psi \in D_A \setminus \{0\}$ such that $A\psi = \lambda\psi$, i.e.
\[
  (A - \lambda)\psi = 0.
\]
Thus, $\psi \in \ker(A - \lambda)$ and hence, since $\psi \neq 0$, we have
\[
  \ker(A - \lambda) \neq \{0\}.
\]
This means that $A - \lambda$ is not injective, hence not invertible and thus, $\lambda \notin \rho(A)$. Then, by
definition, $\lambda \in \sigma(A)$. □


Remark 8.3. If $\mathcal{H}$ is finite-dimensional, then the converse of the above corollary holds and
hence, the spectrum coincides with the set of eigenvalues. However, in infinite-dimensional
spaces, the spectrum of an operator may contain more than just the eigenvalues of the operator.


8.2 The spectrum of a self-adjoint operator 


In what follows, we will primarily be interested in the case of self-adjoint operators.
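Definition. Let $A$ be a self-adjoint operator. Then, we define
(i) the pure point spectrum of $A$
\[
  \sigma_{pp}(A) := \{z \in \mathbb{C} \mid \operatorname{ran}(A - z) = \overline{\operatorname{ran}(A - z)} \neq \mathcal{H}\}
\]

(ii) the point embedded in continuum spectrum of $A$
\[
  \sigma_{pec}(A) := \{z \in \mathbb{C} \mid \operatorname{ran}(A - z) \neq \overline{\operatorname{ran}(A - z)} \neq \mathcal{H}\}
\]

(iii) the purely continuous spectrum of $A$
\[
  \sigma_{pc}(A) := \{z \in \mathbb{C} \mid \operatorname{ran}(A - z) \neq \overline{\operatorname{ran}(A - z)} = \mathcal{H}\}.
\]
These form a partition of $\sigma(A)$, i.e. they are pairwise disjoint and their union is $\sigma(A)$.
Definition. Let $A$ be a self-adjoint operator. Then, we further define

(i) the point spectrum of $A$
\[
  \sigma_p(A) := \sigma_{pp}(A) \cup \sigma_{pec}(A) = \{z \in \mathbb{C} \mid \overline{\operatorname{ran}(A - z)} \neq \mathcal{H}\}
\]

(ii) the continuous spectrum of $A$
\[
  \sigma_c(A) := \sigma_{pec}(A) \cup \sigma_{pc}(A) = \{z \in \mathbb{C} \mid \operatorname{ran}(A - z) \neq \overline{\operatorname{ran}(A - z)}\}.
\]

Clearly, $\sigma_p(A) \cup \sigma_c(A) = \sigma(A)$ but, since $\sigma_p(A) \cap \sigma_c(A) = \sigma_{pec}(A)$ is not necessarily
empty, the point and continuous spectra do not form a partition of the spectrum in general.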


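Lemma 8.4. Let $A$ be self-adjoint and let $\lambda$ be an eigenvalue of $A$. Then, $\lambda \in \mathbb{R}$.


Proof. Let $\psi \in D_A \setminus \{0\}$ be an eigenvector of $A$ associated to $\lambda$. By self-adjointness of $A$,
\[
  \lambda\langle\psi|\psi\rangle = \langle\psi|\lambda\psi\rangle = \langle\psi|A\psi\rangle = \langle A\psi|\psi\rangle = \langle\lambda\psi|\psi\rangle = \overline{\lambda}\langle\psi|\psi\rangle.
\]
Thus, we have
\[
  (\lambda - \overline{\lambda})\langle\psi|\psi\rangle = 0
\]
and since $\psi \neq 0$, it follows that $\lambda = \overline{\lambda}$. That is, $\lambda \in \mathbb{R}$. □


Theorem 8.5. If $A$ is a self-adjoint operator, then the elements of $\sigma_p(A)$ are precisely the
eigenvalues of $A$.


Proof. ($\Leftarrow$) Suppose that $\lambda$ is an eigenvalue of $A$. Then, by self-adjointness of $A$,
\[
  \{0\} \neq \ker(A - \lambda) = \ker(A^* - \overline{\lambda}) = \ker((A - \lambda)^*) = \operatorname{ran}(A - \lambda)^\perp = \overline{\operatorname{ran}(A - \lambda)}^{\,\perp},
\]
where we made use of our previous lemma. Hence, we have
\[
  \overline{\operatorname{ran}(A - \lambda)} = \overline{\operatorname{ran}(A - \lambda)}^{\,\perp\perp} \neq \{0\}^\perp = \mathcal{H}
\]
and thus, $\lambda \in \sigma_p(A)$.


($\Rightarrow$) We now need to show that if $\lambda \in \sigma_p(A)$, then $\lambda$ is an eigenvalue of $A$. By contraposition,
suppose that $\lambda \in \mathbb{C}$ is not an eigenvalue of $A$. Note that if $\lambda$ is real, then $\overline{\lambda} = \lambda$
while if $\lambda$ is not real, then $\overline{\lambda}$ is not real and so, by the previous lemma, it cannot be an eigenvalue. Hence, if $\lambda$ is not an eigenvalue of $A$, then
neither is $\overline{\lambda}$. Therefore, there exists no non-zero $\psi$ in $D_A$ such that $A\psi = \overline{\lambda}\psi$. Thus,
we have
\[
  \{0\} = \ker(A - \overline{\lambda}) = \ker(A^* - \overline{\lambda}) = \ker((A - \lambda)^*) = \operatorname{ran}(A - \lambda)^\perp
\]
and hence
\[
  \overline{\operatorname{ran}(A - \lambda)} = \operatorname{ran}(A - \lambda)^{\perp\perp} = \{0\}^\perp = \mathcal{H}.
\]
Therefore, $\lambda \notin \sigma_p(A)$. □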


Remark 8.6. The contrapositive of the statement $P \Rightarrow Q$ is the statement $\neg Q \Rightarrow \neg P$, where
the symbol $\neg$ denotes logical negation. A statement and its contrapositive are logically
equivalent and “proof by contraposition” simply means “proof of the contrapositive”.


8.3 Perturbation theory for point spectra of self-adjoint operators 


Before we move on to perturbation theory, we will need some preliminary definitions. First,
note that if $\psi$ and $\varphi$ are both eigenvectors of an operator $A$ associated to some eigenvalue
$\lambda$, then, for any $z \in \mathbb{C}$, the vector $z\psi + \varphi$ is either zero or it is again an eigenvector of $A$
associated to $\lambda$.

Definition. Let $A$ be an operator and let $\lambda$ be an eigenvalue of $A$.


(i) The eigenspace of $A$ associated to $\lambda$ is
\[
  \mathrm{Eig}_A(\lambda) := \{\psi \in D_A \mid A\psi = \lambda\psi\}.
\]

(ii) The eigenvalue $\lambda$ is said to be non-degenerate if $\dim \mathrm{Eig}_A(\lambda) = 1$, and degenerate if
$\dim \mathrm{Eig}_A(\lambda) > 1$.


(iii) The degeneracy of $\lambda$ is $\dim \mathrm{Eig}_A(\lambda)$.


Remark 8.7. Of course, it is possible that $\dim \mathrm{Eig}_A(\lambda) = \infty$ in general. However, in this
section, we will only consider operators whose eigenspaces are finite-dimensional.


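Lemma 8.8. Eigenvectors associated to distinct eigenvalues of a self-adjoint operator are
orthogonal.


Proof. Let $\lambda$, $\lambda'$ be distinct eigenvalues of a self-adjoint operator $A$ and let $\psi, \varphi \in D_A \setminus \{0\}$
be eigenvectors associated to $\lambda$ and $\lambda'$, respectively. As $A$ is self-adjoint, we already know
that $\lambda, \lambda' \in \mathbb{R}$. Then, note that
\begin{align*}
  (\lambda - \lambda')\langle\psi|\varphi\rangle &= \lambda\langle\psi|\varphi\rangle - \lambda'\langle\psi|\varphi\rangle\\
  &= \langle\lambda\psi|\varphi\rangle - \langle\psi|\lambda'\varphi\rangle\\
  &= \langle A\psi|\varphi\rangle - \langle\psi|A\varphi\rangle\\
  &= \langle\psi|A\varphi\rangle - \langle\psi|A\varphi\rangle\\
  &= 0.
\end{align*}
Since $\lambda - \lambda' \neq 0$, we must have $\langle\psi|\varphi\rangle = 0$. □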
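In finite dimensions, Lemmas 8.4 and 8.8 can be checked by hand or numerically. The sketch below does this for one Hermitian 2x2 matrix, using the closed-form solution of the characteristic polynomial; the matrix and the helper names are illustrative assumptions, and the eigenvector formula used is only valid when the off-diagonal entry is non-zero.

```python
import cmath

def inner(u, v):
    """Inner product on C^2, antilinear in the first slot."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def eig_hermitian_2x2(A):
    """Eigenvalues and (unnormalised) eigenvectors of a Hermitian 2x2 matrix
    [[a, b], [conj(b), d]] with b != 0, via the characteristic polynomial."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    lams = [(tr - disc) / 2, (tr + disc) / 2]
    # (A - lam) v = 0 is solved by v = (b, lam - a) when b != 0.
    vecs = [[b, lam - a] for lam in lams]
    return lams, vecs

A = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]
lams, vecs = eig_hermitian_2x2(A)

print([abs(lam.imag) < 1e-12 for lam in lams])   # eigenvalues are real
print(abs(inner(vecs[0], vecs[1])) < 1e-12)      # eigenvectors are orthogonal
```

For this particular matrix the trace is 5 and the determinant is 4, so the eigenvalues are 1 and 4.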


A. Unperturbed spectrum 


Let $H_0$ be a self-adjoint operator whose eigenvalues and eigenvectors are known and satisfy
\[
  H_0 e_{n\delta} = h_n e_{n\delta},
\]
where
• the index $n$ varies either over $\mathbb{N}$ or some finite range $1, 2, \ldots, N$
• the real numbers $h_n$ are the eigenvalues of $H_0$
• the index $\delta$ varies over the range $1, 2, \ldots, d(n)$, with $d(n) := \dim \mathrm{Eig}_{H_0}(h_n)$
• for each fixed $n$, the set
\[
  \{e_{n\delta} \in D_{H_0} \mid 1 \le \delta \le d(n)\}
\]
is a linearly independent subset (in fact, a Hamel basis) of $\mathrm{Eig}_{H_0}(h_n)$.


Note that, since we are assuming that all eigenspaces of $H_0$ are finite-dimensional, $\mathrm{Eig}_{H_0}(h_n)$
is a sub-Hilbert space of $\mathcal{H}$ and hence, for each fixed $n$, we can choose the $e_{n\delta}$ so that
\[
  \langle e_{n\alpha}|e_{n\beta}\rangle = \delta_{\alpha\beta}.
\]
In fact, thanks to our previous lemma, we can choose the eigenvectors of $H_0$ so that
\[
  \langle e_{n\alpha}|e_{m\beta}\rangle = \delta_{nm}\delta_{\alpha\beta}.
\]


Let $W\colon D_{H_0} \to \mathcal{H}$ be a not necessarily self-adjoint operator. Let $\lambda \in (-\varepsilon, \varepsilon) \subseteq \mathbb{R}$ and
consider the real one-parameter family of operators $\{H_\lambda \mid \lambda \in (-\varepsilon, \varepsilon)\}$, where
\[
  H_\lambda := H_0 + \lambda W.
\]
Further assume that $H_\lambda$ is self-adjoint for all $\lambda \in (-\varepsilon, \varepsilon)$. Recall, however, that this
assumption does not force $W$ to be self-adjoint.


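We seek to understand the eigenvalue equation for $H_\lambda$,
\[
  H_\lambda e_{n\delta}(\lambda) = h_{n\delta}(\lambda)\, e_{n\delta}(\lambda),
\]
by exploiting the fact that it coincides with the eigenvalue equation for $H_0$ when $\lambda = 0$.
In particular, we will be interested in the lifting of the degeneracy of $h_n$ (for some fixed
$n$) once the perturbation $W$ is “switched on”, i.e. when $\lambda \neq 0$. Indeed, it is possible, for
instance, that while the two eigenvectors $e_{n1}$ and $e_{n2}$ are associated to the same (degenerate)
eigenvalue $h_n$ of $H_0$, the “perturbed” eigenvectors $e_{n1}(\lambda)$ and $e_{n2}(\lambda)$ may be associated to
different eigenvalues of $H_\lambda$. This is why we added a $\delta$-index to the eigenvalue in
the above equation. Of course, when $\lambda = 0$, we have $h_{n\delta}(\lambda) = h_n$ for all $\delta$.

B. Formal power series ansatz¹³

In order to determine $h_{n\delta}(\lambda)$ and $e_{n\delta}(\lambda)$, we make, for both, the following ansatz
\begin{align*}
  h_{n\delta}(\lambda) &=: h_n + \lambda\,\theta^{(1)}_{n\delta} + \lambda^2\theta^{(2)}_{n\delta} + O(\lambda^3)\\
  e_{n\delta}(\lambda) &=: e_{n\delta} + \lambda\,\epsilon^{(1)}_{n\delta} + \lambda^2\epsilon^{(2)}_{n\delta} + O(\lambda^3),
\end{align*}
where $\theta^{(1)}_{n\delta}, \theta^{(2)}_{n\delta} \in \mathbb{R}$ and $\epsilon^{(1)}_{n\delta}, \epsilon^{(2)}_{n\delta} \in D_{H_0}$.


Remark 8.9. Recall that the Big O notation is defined as follows. If $f$ and $g$ are functions
$I \subseteq \mathbb{R} \to \mathbb{R}$ and $a \in I$, then we write
\[
  f(x) = O(g(x)) \quad\text{as } x \to a
\]
to mean
\[
  \exists\, k, M > 0 : \forall\, x \in I : \quad 0 < |x - a| < k \implies |f(x)| < M|g(x)|.
\]
The qualifier “as $x \to a$” can be omitted when the value of $a$ is clear from the context. In
our expressions above, we obviously have “as $\lambda \to 0$”.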


C. Fixing phase and normalisation of perturbed eigenvectors 


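Eigenvectors in a complex vector space are only defined up to a complex scalar or, alternatively,
up to phase and magnitude. Hence, we impose the following conditions relating the
perturbed eigenvalues and eigenvectors to the unperturbed ones.


We require, for all $\lambda \in (-\varepsilon, \varepsilon)$, all $n$ and all $\delta$,
(i) $\operatorname{Im}\langle e_{n\delta}|e_{n\delta}(\lambda)\rangle = 0$
(ii) $\|e_{n\delta}(\lambda)\|^2 = 1$.
Inserting the formal power series ansatz into these conditions yields
(i) $\operatorname{Im}\langle e_{n\delta}|\epsilon^{(k)}_{n\delta}\rangle = 0$ for $k = 1, 2, \ldots$
(ii) $0 = 2\lambda \operatorname{Re}\langle e_{n\delta}|\epsilon^{(1)}_{n\delta}\rangle + \lambda^2\left(2\operatorname{Re}\langle e_{n\delta}|\epsilon^{(2)}_{n\delta}\rangle + \|\epsilon^{(1)}_{n\delta}\|^2\right) + O(\lambda^3)$.


¹³ German for “educated guess”.


Since (ii) holds for all $\lambda \in (-\varepsilon, \varepsilon)$, we must have
\[
  \operatorname{Re}\langle e_{n\delta}|\epsilon^{(1)}_{n\delta}\rangle = 0, \qquad 2\operatorname{Re}\langle e_{n\delta}|\epsilon^{(2)}_{n\delta}\rangle + \|\epsilon^{(1)}_{n\delta}\|^2 = 0.
\]
Since we know from (i) that $\operatorname{Im}\langle e_{n\delta}|\epsilon^{(1)}_{n\delta}\rangle = 0$ and $\operatorname{Im}\langle e_{n\delta}|\epsilon^{(2)}_{n\delta}\rangle = 0$, we can conclude
\[
  \langle e_{n\delta}|\epsilon^{(1)}_{n\delta}\rangle = 0, \qquad \langle e_{n\delta}|\epsilon^{(2)}_{n\delta}\rangle = -\tfrac{1}{2}\|\epsilon^{(1)}_{n\delta}\|^2.
\]
That is, $\epsilon^{(1)}_{n\delta}$ is orthogonal to $e_{n\delta}$ and
\[
  \epsilon^{(2)}_{n\delta} = -\tfrac{1}{2}\|\epsilon^{(1)}_{n\delta}\|^2 e_{n\delta} + e
\]
for some $e \in \operatorname{span}(\{e_{n\delta}\})^\perp$.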


D. Order-by-order decomposition of the perturbed eigenvalue problem 


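Let us insert our formal power series ansatz into the perturbed eigenvalue equation. On
the left-hand side, we find
\begin{align*}
  H_\lambda e_{n\delta}(\lambda) &= (H_0 + \lambda W)\left(e_{n\delta} + \lambda\epsilon^{(1)}_{n\delta} + \lambda^2\epsilon^{(2)}_{n\delta} + O(\lambda^3)\right)\\
  &= H_0 e_{n\delta} + \lambda\left(W e_{n\delta} + H_0\epsilon^{(1)}_{n\delta}\right) + \lambda^2\left(W\epsilon^{(1)}_{n\delta} + H_0\epsilon^{(2)}_{n\delta}\right) + O(\lambda^3),
\end{align*}
while, on the right-hand side, we have
\begin{align*}
  h_{n\delta}(\lambda)e_{n\delta}(\lambda) &= \left(h_n + \lambda\theta^{(1)}_{n\delta} + \lambda^2\theta^{(2)}_{n\delta} + O(\lambda^3)\right)\left(e_{n\delta} + \lambda\epsilon^{(1)}_{n\delta} + \lambda^2\epsilon^{(2)}_{n\delta} + O(\lambda^3)\right)\\
  &= h_n e_{n\delta} + \lambda\left(h_n\epsilon^{(1)}_{n\delta} + \theta^{(1)}_{n\delta}e_{n\delta}\right) + \lambda^2\left(h_n\epsilon^{(2)}_{n\delta} + \theta^{(1)}_{n\delta}\epsilon^{(1)}_{n\delta} + \theta^{(2)}_{n\delta}e_{n\delta}\right) + O(\lambda^3).
\end{align*}
Comparing terms order-by-order yields
\begin{align*}
  (H_0 - h_n)e_{n\delta} &= 0\\
  (H_0 - h_n)\epsilon^{(1)}_{n\delta} &= -(W - \theta^{(1)}_{n\delta})e_{n\delta}\\
  (H_0 - h_n)\epsilon^{(2)}_{n\delta} &= -(W - \theta^{(1)}_{n\delta})\epsilon^{(1)}_{n\delta} + \theta^{(2)}_{n\delta}e_{n\delta}.
\end{align*}
Of course, one may continue this expansion up to the desired order. Note that the zeroth
order equation is just our unperturbed eigenvalue equation.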
E. First-order correction 


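To extract information from the first-order equation, let us project both sides onto the
unperturbed eigenvectors $e_{n\alpha}$ (i.e. apply $\langle e_{n\alpha}|\cdot\rangle$ to both sides). This yields
\[
  \langle e_{n\alpha}|(H_0 - h_n)\epsilon^{(1)}_{n\delta}\rangle = -\langle e_{n\alpha}|(W - \theta^{(1)}_{n\delta})e_{n\delta}\rangle.
\]
By self-adjointness of $H_0$, we have
\[
  \langle e_{n\alpha}|(H_0 - h_n)\epsilon^{(1)}_{n\delta}\rangle = \langle (H_0 - h_n)^* e_{n\alpha}|\epsilon^{(1)}_{n\delta}\rangle = \langle (H_0 - h_n)e_{n\alpha}|\epsilon^{(1)}_{n\delta}\rangle = 0.
\]
Therefore,
\[
  0 = -\langle e_{n\alpha}|W e_{n\delta}\rangle + \langle e_{n\alpha}|\theta^{(1)}_{n\delta}e_{n\delta}\rangle = -\langle e_{n\alpha}|W e_{n\delta}\rangle + \theta^{(1)}_{n\delta}\delta_{\alpha\delta}
\]
and thus, setting $\alpha = \delta$, the first-order eigenvalue correction is
\[
  \theta^{(1)}_{n\delta} = \langle e_{n\delta}|W e_{n\delta}\rangle.
\]
For $\alpha \neq \delta$, the same equation requires $\langle e_{n\alpha}|W e_{n\delta}\rangle = 0$, i.e. within each degenerate
eigenspace the unperturbed eigenvectors must be chosen so that $W$ is diagonal with respect to them.

Note that the right-hand side of the first-order equation is now completely known and
hence, if $H_0 - h_n$ were invertible, we could determine $\epsilon^{(1)}_{n\delta}$ immediately. However, this is
not the case, since $\ker(H_0 - h_n) = \mathrm{Eig}_{H_0}(h_n) \neq \{0\}$. Instead, we proceed as
follows. Let $E := \mathrm{Eig}_{H_0}(h_n)$. Then, we can rewrite the right-hand side of the first-order
equation as
\begin{align*}
  -(W - \theta^{(1)}_{n\delta})e_{n\delta} &= -\mathrm{id}_{\mathcal{H}}(W - \theta^{(1)}_{n\delta})e_{n\delta}\\
  &= -(P_E + P_{E^\perp})(W - \theta^{(1)}_{n\delta})e_{n\delta}\\
  &= -\sum_{\beta=1}^{d(n)}\langle e_{n\beta}|(W - \theta^{(1)}_{n\delta})e_{n\delta}\rangle e_{n\beta} - P_{E^\perp}W e_{n\delta} + \theta^{(1)}_{n\delta}P_{E^\perp}e_{n\delta}\\
  &= -P_{E^\perp}W e_{n\delta},
\end{align*}
so that we have $(H_0 - h_n)\epsilon^{(1)}_{n\delta} \in E^\perp$. Note that the operator
\[
  P_{E^\perp} \circ (H_0 - h_n) \circ P_{E^\perp}\colon E^\perp \to E^\perp
\]
is invertible. Hence, the equation
\[
  P_{E^\perp}(H_0 - h_n)P_{E^\perp}\epsilon^{(1)}_{n\delta} = -P_{E^\perp}W e_{n\delta}
\]
is solved by
\[
  P_{E^\perp}\epsilon^{(1)}_{n\delta} = -P_{E^\perp}(H_0 - h_n)^{-1}P_{E^\perp}W e_{n\delta}.
\]
The “full” eigenvector correction $\epsilon^{(1)}_{n\delta}$ is given by
\[
  \epsilon^{(1)}_{n\delta} = \mathrm{id}_{\mathcal{H}}\,\epsilon^{(1)}_{n\delta} = (P_E + P_{E^\perp})\epsilon^{(1)}_{n\delta} = \sum_{\beta=1}^{d(n)} c_{\delta\beta}e_{n\beta} - P_{E^\perp}(H_0 - h_n)^{-1}P_{E^\perp}W e_{n\delta},
\]
where the coefficients $c_{\delta\beta}$ cannot be fully determined at this order in the perturbation.
What we do know is that our previous fixing of the phase and normalisation of the perturbed
eigenvectors implies that $\epsilon^{(1)}_{n\delta}$ is orthogonal to $e_{n\delta}$, and hence we must have $c_{\delta\delta} = 0$.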
F. Second-order eigenvalue correction 


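Here we will content ourselves with calculating the second-order correction to the eigenvalues
only, since that is the physically interesting formula. As before, we proceed by
projecting both sides of the second-order equation onto an unperturbed eigenvector, this
time specifically $e_{n\delta}$. We find
\[
  \langle e_{n\delta}|(H_0 - h_n)\epsilon^{(2)}_{n\delta}\rangle = -\langle e_{n\delta}|W\epsilon^{(1)}_{n\delta}\rangle + \theta^{(1)}_{n\delta}\langle e_{n\delta}|\epsilon^{(1)}_{n\delta}\rangle + \theta^{(2)}_{n\delta}\langle e_{n\delta}|e_{n\delta}\rangle.
\]
Noting, as before, that
\[
  \langle e_{n\delta}|(H_0 - h_n)\epsilon^{(2)}_{n\delta}\rangle = 0
\]
and recalling that $\langle e_{n\delta}|\epsilon^{(1)}_{n\delta}\rangle = 0$ and $\langle e_{n\delta}|e_{n\delta}\rangle = 1$, we have
\[
  \theta^{(2)}_{n\delta} = \langle e_{n\delta}|W\epsilon^{(1)}_{n\delta}\rangle.
\]
Plugging in our previous expression for $\epsilon^{(1)}_{n\delta}$ yields
\begin{align*}
  \theta^{(2)}_{n\delta} &= \Big\langle e_{n\delta}\Big|\, W\sum_{\beta=1}^{d(n)} c_{\delta\beta}e_{n\beta} - W P_{E^\perp}(H_0 - h_n)^{-1}P_{E^\perp}W e_{n\delta}\Big\rangle\\
  &= \sum_{\beta=1}^{d(n)} c_{\delta\beta}\langle e_{n\delta}|W e_{n\beta}\rangle - \langle e_{n\delta}|W P_{E^\perp}(H_0 - h_n)^{-1}P_{E^\perp}W e_{n\delta}\rangle\\
  &= \sum_{\beta=1}^{d(n)} c_{\delta\beta}\,\theta^{(1)}_{n\beta}\,\delta_{\delta\beta} - \langle e_{n\delta}|W P_{E^\perp}(H_0 - h_n)^{-1}P_{E^\perp}W e_{n\delta}\rangle\\
  &= -\langle e_{n\delta}|W P_{E^\perp}(H_0 - h_n)^{-1}P_{E^\perp}W e_{n\delta}\rangle
\end{align*}
since $c_{\delta\delta} = 0$. We now assume that the eigenvectors of $H_0$
form an orthonormal basis of $\mathcal{H}$ (this holds, for instance, when $H_0$ has purely discrete spectrum, but not for
every self-adjoint operator). In particular, this implies that we can decompose the
identity operator on $\mathcal{H}$ as
\[
  \mathrm{id}_{\mathcal{H}} = \sum_{n=1}^{\infty}\sum_{\beta=1}^{d(n)} \langle e_{n\beta}|\cdot\rangle\, e_{n\beta}.
\]
By inserting this appropriately into our previous expression for $\theta^{(2)}_{n\delta}$, and using
$(H_0 - h_n)^{-1}e_{m\beta} = (h_m - h_n)^{-1}e_{m\beta}$ for $m \neq n$, we obtain
\[
  \theta^{(2)}_{n\delta} = -\sum_{\substack{m=1\\ m \neq n}}^{\infty}\sum_{\beta=1}^{d(m)} \frac{|\langle e_{m\beta}|W e_{n\delta}\rangle|^2}{h_m - h_n}.
\]
Putting everything together, we have the following second-order expansion of the perturbed
eigenvalues
\begin{align*}
  h_{n\delta}(\lambda) &= h_n + \lambda\theta^{(1)}_{n\delta} + \lambda^2\theta^{(2)}_{n\delta} + O(\lambda^3)\\
  &= h_n + \lambda\langle e_{n\delta}|W e_{n\delta}\rangle - \lambda^2\sum_{\substack{m=1\\ m \neq n}}^{\infty}\sum_{\beta=1}^{d(m)} \frac{|\langle e_{m\beta}|W e_{n\delta}\rangle|^2}{h_m - h_n} + O(\lambda^3).
\end{align*}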
Remark 8.10. Note that, while the first-order correction to the perturbed $n\delta$ eigenvalue
only depends on the unperturbed $n\delta$ eigenvalue and eigenvector, the second-order correction
draws information from all the unperturbed eigenvalues and eigenvectors. Hence, if we try
to approximate a relativistic system as a perturbation of a non-relativistic system, then the
second-order corrections may be unreliable.



9 Case study: momentum operator 


We will now put the machinery developed so far to work by considering the so-called
momentum operator for two cases: a compact interval $[a,b] \subseteq \mathbb{R}$, and a circle. As the
name suggests, this operator is meant to be the QM observable whose eigenvalues are the
momenta of the system. It is clear, therefore, that we require (recall $\mathcal{H} = L^2(\mathbb{R}^d)$ up to
unitary equivalence)
\[
  P\colon D_P \to L^2(\mathbb{R}^d)
\]
to be self-adjoint.

We will specialise to the case of $d = 1$ in order to simplify things, while also demonstrating
the main ideas. The concepts can be extended to higher values of $d$. We will also
set $\hbar = 1$ throughout this section.


9.1 The Momentum Operator 


Definition. The momentum operator is an operator $P$ given by
\[
  P\colon D_P \to L^2(\mathbb{R}), \qquad \psi \mapsto (-i)\psi',
\]
where the prime indicates a derivative.

The first obvious question is why does this deserve its name? (i.e. how is it related
to what we know classically as the momentum?) The answer, unfortunately, cannot yet
be provided in full detail as it requires us to know the spectral theorem and the Stone-von
Neumann theorem. However, these details are provided here as they will help later when
discussing these theorems. For now we must just take it on faith.¹⁴

There is yet another important question we must ask: how do we choose Dp? The 
immediate response might be ‘such that the derivative is square integrable.’ However, this 
is not good enough. We also require that P be self adjoint and, as we have seen previously, 
the concept of self adjointness depends heavily on the domains considered. 

Luckily, not all hope is lost. The method will be as follows: guess a reasonable $D_P$ and
then search for a self-adjoint extension, should one exist. Before doing so, though, we will
first introduce some new definitions that will prove invaluable.


9.2 Absolutely Continuous Functions and Sobolev Spaces

During the calculations that follow we will naturally encounter three spaces:
(i) The space of once continuously differentiable functions over some interval $I$, $C^1(I)$,
(ii) The Sobolev space $H^1(I)$, and
(iii) The space of absolutely continuous functions, $AC(I)$.


As we shall see they are related via
\[
  C^1(I) \subseteq H^1(I) \subseteq AC(I),
\]
and so they will provide a convenient way to compare the domains $D_P$, $D_{P^*}$, etc., to test
for self-adjointness.


¹⁴ Very unlike Dr. Schuller.


Definition. Let $I \subseteq \mathbb{R}$. A function $\psi\colon I \to \mathbb{C}$ is called absolutely continuous if there exists
a Lebesgue integrable function $\rho\colon I \to \mathbb{C}$ such that
\[
  \psi(x) = \psi(a) + \int_a^x \rho(y)\, dy
\]
for all compact subsets $[a,x] \subseteq I$.


Corollary 9.1. Given an absolutely continuous function, it is clear that $\rho =_{\mathrm{a.e.}} \psi'$, where the
almost everywhere condition comes from the fact that a Lebesgue integral does not distinguish
two functions that differ only on a set of measure zero.


Definition. The set of absolutely continuous functions is simply
\[
  AC(I) := \{\psi \in L^2(I) \mid \psi \text{ is absolutely continuous}\}.
\]


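Definition. Let $\Omega \subseteq \mathbb{R}$ be open and $\psi\colon \Omega \to \mathbb{C}$ be Lebesgue measurable. $\psi$ is called
$p$-locally integrable if, for $1 \le p < \infty$,
\[
  \int_K |\psi(x)|^p\, dx < \infty
\]
for all compact subsets $K \subseteq \Omega$. The set of all such functions is
\[
  L^p_{\mathrm{loc}}(\Omega) := \{\psi\colon \Omega \to \mathbb{C} \mid \psi \text{ measurable},\ \psi|_K \in L^p(K),\ \forall K \subseteq \Omega,\ K \text{ compact}\}.
\]
Remark 9.2. For the case $p = 1$, we just call $\psi$ locally integrable.


Theorem 9.3. Every $\psi \in L^p(\Omega)$ for $1 \le p \le \infty$ is locally integrable. In other words
\[
  L^p(\Omega) \subseteq L^1_{\mathrm{loc}}(\Omega).
\]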


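Definition. A function $\psi \in L^1_{\mathrm{loc}}(\Omega)$ is called weakly differentiable if there exists a $\rho \in
L^1_{\mathrm{loc}}(\Omega)$ such that
\[
  \int_\Omega \psi(x)\varphi'(x)\, dx = -\int_\Omega \rho(x)\varphi(x)\, dx,
\]
for all $\varphi \in C_c^\infty(\Omega)$.¹⁵ This function is known as the weak derivative of $\psi$ and is denoted by
$\rho =: \psi'$.


Corollary 9.4. Note that for any weakly differentiable function the integration by parts
result,
\[
  \int_\Omega \psi(x)\varphi'(x)\, dx = -\int_\Omega \psi'(x)\varphi(x)\, dx,
\]
holds for all $\varphi \in C_c^\infty(\Omega)$.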
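A standard worked example may help here: the function $\psi(x) = |x|$ on $\Omega = (-1, 1)$ is not differentiable at $0$ in the classical sense, yet it is weakly differentiable with weak derivative $\operatorname{sgn}(x)$. The computation below is a sketch for an arbitrary test function $\varphi \in C_c^\infty(\Omega)$, splitting the integral at $0$ and using that $\varphi$ vanishes near $\pm 1$.

```latex
\begin{align*}
  \int_{-1}^{1} |x|\,\varphi'(x)\,dx
    &= \int_{-1}^{0} (-x)\,\varphi'(x)\,dx + \int_{0}^{1} x\,\varphi'(x)\,dx \\
    &= \Big[-x\varphi(x)\Big]_{-1}^{0} + \int_{-1}^{0} \varphi(x)\,dx
       + \Big[x\varphi(x)\Big]_{0}^{1} - \int_{0}^{1} \varphi(x)\,dx \\
    &= \int_{-1}^{0} \varphi(x)\,dx - \int_{0}^{1} \varphi(x)\,dx
     = -\int_{-1}^{1} \operatorname{sgn}(x)\,\varphi(x)\,dx .
\end{align*}
```

The value assigned to $\operatorname{sgn}(0)$ is irrelevant, since weak derivatives are only determined up to almost-everywhere equality.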


¹⁵ The subscript indicates that $\varphi$ vanishes at the limits of integration.


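Corollary 9.5. Given that $\psi \in C^k(\Omega)$, $\varphi \in C_c^\infty(\Omega)$, we can show by induction that, for all $\alpha \le k$,
\[
  \int_\Omega \psi(x)\varphi^{(\alpha)}(x)\, dx = (-1)^\alpha \int_\Omega \psi^{(\alpha)}(x)\varphi(x)\, dx,
\]
where $\psi^{(\alpha)}(x)$ means the $\alpha$-order derivative of $\psi(x)$.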


Remark 9.6. In the above Corollary we have used the fact that we are only considering 
one dimensional problems here. The expression is much the same for higher dimensional 
problems, however one has to take into account the different derivative directions. 


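Definition. Let $\Omega \subseteq \mathbb{R}$ be open, $k \in \mathbb{N}$ and $1 \le p \le \infty$. The Sobolev space is the space
with set
\[
  W^{k,p}(\Omega) := \{\psi \in L^p(\Omega) \cap W^k(\Omega) \mid \psi^{(\alpha)} \in L^p(\Omega),\ \forall\, |\alpha| \le k\},
\]
where $W^k(\Omega)$ is the set of all locally integrable functions that also have weak derivatives of order
$\alpha$ for all $|\alpha| \le k$. We introduce the notation $H^k(\Omega) := W^{k,2}(\Omega)$.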


Remark 9.7. Sobolev spaces can be made into Banach spaces by equipping them with a 
norm and H*(Q) can be made into a Hilbert space. 


Proposition 9.8. We can rewrite the space $H^1(\Omega)$ as
$$H^1(\Omega) = \{\psi \in AC(\Omega) \mid \psi' \in L^2(\Omega)\},$$
where $\psi'$ denotes the normal notion of derivative.
Proof. See Theorem 7.13 in 'A First Course in Sobolev Spaces, Giovanni Leoni'. □


9.3 Momentum Operator on a Compact Interval 


Let's now consider the case where the physical space is some compact interval in $\mathbb{R}$ (i.e.
we have a particle moving along the bottom of a well). W.l.o.g. take $[0, 2\pi] =: I$; the
justification for this choice will become clear when we next consider a circle.

We now need to come up with a reasonable guess for the domain $D_P$. First recall

$$P\colon D_P \to L^2(I), \qquad \psi \mapsto (-i)\psi'.$$

It is therefore reasonable to restrict ourselves to $\psi \in C^1(I)$. Equally, physically we expect
the function to vanish at the boundaries (the walls of the well). This gives us our first guess

$$D_P := \{\psi \in C^1(I) \mid \psi(0) = 0 = \psi(2\pi)\} =: C_c^1(I).$$


The question still remains, though, as to whether P is self adjoint. Recalling the results 
of Lecture 7, it is first instructive to see if P is symmetric. 


A. Symmetric? 


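Let $\psi, \varphi \in D_P$, then

$$\begin{aligned}
\langle \psi, P\varphi \rangle &= \int_0^{2\pi} dx\, \overline{\psi(x)}\,(-i)\varphi'(x) \\
&= \Big[-i\,\overline{\psi(x)}\,\varphi(x)\Big]_0^{2\pi} + \int_0^{2\pi} dx\, \overline{(-i)\psi'(x)}\,\varphi(x) \\
&= \int_0^{2\pi} dx\, \overline{(-i)\psi'(x)}\,\varphi(x) \\
&= \langle P\psi, \varphi \rangle,
\end{aligned}$$

where integration by parts was used and the boundary term vanishes because $\psi$ and $\varphi$ vanish at $0$ and $2\pi$. So, yes, P is symmetric.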
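The symmetry computation can be checked numerically. The following is only a sketch: the two test functions, the Simpson quadrature helper and all names are illustrative choices, not part of the lectures. It compares $\langle\psi, P\varphi\rangle$ with $\langle P\psi, \varphi\rangle$ for functions vanishing at both ends, and then shows the boundary term $-i[\overline{\psi}\varphi]_0^{2\pi}$ reappearing for functions that do not vanish there.

```python
import cmath
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule for a complex-valued integrand (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def inner(f, g):
    """L^2([0, 2*pi]) inner product, antilinear in the first slot."""
    return simpson(lambda x: f(x).conjugate() * g(x), 0.0, 2 * math.pi)

# Illustrative elements of D_P: both vanish at 0 and at 2*pi.
psi = lambda x: math.sin(x / 2) * cmath.exp(1j * x)
dpsi = lambda x: (0.5 * math.cos(x / 2) + 1j * math.sin(x / 2)) * cmath.exp(1j * x)
phi = lambda x: x * (2 * math.pi - x)
dphi = lambda x: 2 * math.pi - 2 * x

# P acts as f -> -i f'.
lhs = inner(psi, lambda x: -1j * dphi(x))   # <psi, P phi>
rhs = inner(lambda x: -1j * dpsi(x), phi)   # <P psi, phi>
print(abs(lhs - rhs))                       # tiny: P is symmetric on D_P

# Functions that do not vanish at the boundary: chi1(x) = 1, chi2(x) = x.
one = lambda x: 1.0
ident = lambda x: x
gap = inner(one, lambda x: -1j * 1.0) - inner(lambda x: -1j * 0.0, ident)
print(gap)                                  # equals -i * [conj(chi1) * chi2] from 0 to 2*pi
```

The second printout is the boundary term that the conditions $\psi(0) = 0 = \psi(2\pi)$ remove.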


B. Self Adjoint? 


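From above we know that $P \subseteq P^*$, and so $D_P \subseteq D_{P^*}$, so we need to ask how
$P^*$ behaves outside the domain $D_P$. The obvious answer is to just extend the definition to
be

$$P^*\colon D_{P^*} \to L^2(I), \qquad \psi \mapsto (-i)\psi'.$$

Note that the $\psi$ here is not necessarily the same as the $\psi$ above. The same symbol is
just used and the context tells us where it lives.
All that is left to check is the domain $D_{P^*}$. From the definition of the adjoint we have

$$\psi \in D_{P^*} \implies \exists \eta \in L^2(I): \forall \varphi \in D_P: \langle \psi, P\varphi \rangle = \langle \eta, \varphi \rangle,$$

with $\eta := P^*\psi$. Before proceeding further with the calculation, first introduce a function
$N\colon I \to \mathbb{C}$ such that $\eta =_{\text{a.e.}} N'$. Note that N is Lebesgue integrable and that the almost
everywhere condition is sufficient as $\eta$ appears in a Lebesgue integral. Therefore we have

$$\int_0^{2\pi} dx\, \overline{\psi(x)}\,(-i)\varphi'(x) = \int_0^{2\pi} dx\, \overline{N'(x)}\,\varphi(x) = -\int_0^{2\pi} dx\, \overline{N(x)}\,\varphi'(x),$$

where the boundary term vanishes because $\varphi \in D_P$. Rearranging gives $\langle \psi - iN, \varphi' \rangle = 0$ for all $\varphi \in D_P$, which tells us that
$$\psi - iN \in \{\varphi' \mid \varphi \in D_P\}^{\perp}.$$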


This does not appear to have got us any closer to determining the domain Dp«. However 
consider the following two Lemmas. 


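Lemma 9.9. $\{\varphi' \mid \varphi \in D_P\} = \{\xi \in C^0(I) \mid \int_0^{2\pi} \xi(x)\,dx = 0\}$.


Proof. Let $A := \{\varphi' \mid \varphi \in D_P\}$ and $B := \{\xi \in C^0(I) \mid \int_0^{2\pi} \xi(x)\,dx = 0\}$.
Now consider a $\varphi' \in A$, then

$$\int_0^{2\pi} \varphi'(x)\,dx = [\varphi(x)]_0^{2\pi} = 0,$$

so clearly $\xi := \varphi' \in B$ and $A \subseteq B$.
Now consider a $\xi \in B$ and define

$$\varphi(x) := \int_0^x \xi(y)\,dy.$$

Then, since $\xi \in C^0(I)$ it follows that $\varphi \in C^1(I)$. It also follows that $\varphi(0) = 0 = \varphi(2\pi)$ and
so $\varphi' = \xi \in A$ and $B \subseteq A$. □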


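Lemma 9.10. Let $\{1\}$ denote the set consisting of the element $1 \in L^2(I)$ with $1(x) = 1_{\mathbb{C}}$
for all $x \in I$. Then

$$\{\varphi' \mid \varphi \in D_P\}^{\perp} = \{1\}^{\perp\perp}.$$


Proof. From the previous Lemma we have

$$\begin{aligned}
\{\varphi' \mid \varphi \in D_P\}^{\perp} &= \{\xi \in C^0(I) \mid \langle 1, \xi \rangle = 0\}^{\perp} \\
&= \overline{\{\xi \in C^0(I) \mid \langle 1, \xi \rangle = 0\}}^{\perp} \\
&= \{\xi \in L^2(I) \mid \langle 1, \xi \rangle = 0\}^{\perp} \\
&= \{1\}^{\perp\perp},
\end{aligned}$$
where the fact that $C^0(I)$ is dense in $L^2(I)$ was used to go from the second to the third line. □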


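Putting this all together we have

$$\psi - iN \in \{\varphi' \mid \varphi \in D_P\}^{\perp} = \{1\}^{\perp\perp} = \{C\colon I \to \mathbb{C} \mid x \mapsto C_{\mathbb{C}}\},$$
where $C_{\mathbb{C}}$ is a constant in $\mathbb{C}$. Recalling that N is the integral of a Lebesgue integrable function, we see that
$$\psi(x) = C_{\mathbb{C}} + iN(x) \in AC(I),$$
and so
$$D_{P^*} \subseteq AC(I).$$

Now recalling

$$P^*\colon D_{P^*} \to L^2(I), \qquad \psi \mapsto (-i)\psi',$$

and using Proposition 9.8, we have
$$D_{P^*} \subseteq H^1(I).$$
Finally we see that all of the integration by parts results above were of the form
$$\int_I dx\, \psi'(x)\,\varphi(x) = -\int_I dx\, \psi(x)\,\varphi'(x),$$
for arbitrary $\varphi \in C_c^1(I)$. Because $C_c^\infty(I) \subseteq C_c^1(I)$, the integrals also hold for any $\varphi \in C_c^\infty(I)$,
but this is just the condition for the weak derivative and so we see that

$$D_{P^*} = H^1(I),$$
and
$$D_P \subsetneq D_{P^*},$$
so P is not self adjoint.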
C. Essentially Self Adjoint? 


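We have managed to show that our initial guess for P is not self adjoint. The next step
is to ask if there is a self adjoint extension and if this extension is unique. Recall that a
symmetric operator has a unique self adjoint extension if it is essentially self adjoint (i.e.
if its closure $\overline{P} = P^{**}$ is self adjoint). From the definition of the adjoint we have

$$\psi \in D_{\overline{P}} \implies \forall \varphi \in D_{P^*}: \langle \psi, P^*\varphi \rangle = \langle \overline{P}\psi, \varphi \rangle.$$
Now, recall that for a symmetric operator $P \subseteq \overline{P} \subseteq P^*$, so it's clear that

$$\langle \overline{P}\psi, \varphi \rangle = \langle P^*\psi, \varphi \rangle$$

in the above. Writing these as integrals we have

$$\begin{aligned}
\int_0^{2\pi} dx\, \overline{\psi(x)}\,(-i)\varphi'(x) &= \int_0^{2\pi} dx\, \overline{(-i)\psi'(x)}\,\varphi(x) \\
\implies\quad 0 &= \int_0^{2\pi} dx\, \Big[\overline{\psi(x)}\,\varphi'(x) + \overline{\psi'(x)}\,\varphi(x)\Big] \\
&= \Big[\overline{\psi(x)}\,\varphi(x)\Big]_0^{2\pi} \\
&= \overline{\psi(2\pi)}\,\varphi(2\pi) - \overline{\psi(0)}\,\varphi(0),
\end{aligned}$$

where again integration by parts has been used. We need to be careful in what conclusions
we draw from this final statement, though. $\varphi \in D_{P^*} = H^1(I)$, which places no restrictions
on the values of $\varphi$ on the boundary, nor does it impose any relation between the two values
$\varphi(0)$ and $\varphi(2\pi)$; they are independently arbitrary. We must, therefore, conclude that

$$\overline{\psi(2\pi)} = \psi(2\pi) = 0 = \psi(0) = \overline{\psi(0)},$$
and so, at best,
$$D_{\overline{P}} = \{\psi \in H^1(I) \mid \psi(2\pi) = 0 = \psi(0)\} \subsetneq D_{P^*},$$
and so $\overline{P} \neq P^*$, which, after taking the adjoint of both sides, tells us that $\overline{P}^* \neq \overline{P}$, and so
P is not even essentially self adjoint.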


D. Defect Indices 


We only have one tool left to look for a self adjoint extension of P: check the defect indices
to see if a self adjoint extension even exists. Recall

$$d_+ := \dim\big(\ker(P^* - i)\big), \qquad d_- := \dim\big(\ker(P^* + i)\big),$$

and a symmetric operator has a (not necessarily unique) self adjoint extension if $d_+ = d_-$.
We therefore need to determine how many $\psi \in D_{P^*}$ lie in $\ker(P^* \mp i)$:

$$\begin{aligned}
(P^* \mp i)\psi &= 0 \\
-i\psi' \mp i\psi &= 0 \\
\psi(x) &= a_\pm e^{\mp x}
\end{aligned}$$

for $a_+, a_- \in \mathbb{C}$. There is only one solution for each and so $d_+ = 1 = d_-$. We therefore know
that there does exist at least one self adjoint extension of P $^{16}$, however we don't know the
form of any of them. $^{17}$


Remark 9.11. If instead of a compact interval we take a half line $I = [a, \infty)$, then $d_+ \neq d_-$
and so there is no self adjoint extension of P, meaning there is no notion of a QM momentum
in this case. Note however that people often talk about free particles along an infinite line
in QM, but they always require the wave function $\psi(x)$ to vanish at $\pm\infty$. This is clearly
just the same as taking a large, yet finite, compact interval $I = [a, b]$.
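The mismatch of the defect indices on a half line can be made concrete. The sketch below assumes $a = 0$ and uses the two candidate solutions $e^{\mp x}$ found above; the function name is illustrative. It evaluates the squared $L^2$ norm over $[0, L]$ in closed form and lets $L$ grow: $e^{-x}$ stays square integrable while $e^{+x}$ does not, so only one of the two kernels contributes.

```python
import math

def norm_sq_on(L, sign):
    """Closed form of the integral of |exp(sign * x)|^2 over [0, L]."""
    return (math.exp(2 * sign * L) - 1.0) / (2 * sign)

for L in (1.0, 5.0, 10.0, 20.0):
    decaying = norm_sq_on(L, -1)   # candidate in ker(P* - i): exp(-x)
    growing = norm_sq_on(L, +1)    # candidate in ker(P* + i): exp(+x)
    print(f"L = {L:5.1f}   ||e^-x||^2 = {decaying:.6f}   ||e^+x||^2 = {growing:.3e}")
```

The first column converges to 1/2 while the second grows without bound, which is consistent with $d_+ = 1$ and $d_- = 0$ on $[0, \infty)$.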


9.4 Momentum Operator on a Circle 


We now want to repeat all of the above but for a circle instead of a finite line segment.
Fortunately almost all the work is done; the only slight difference is in the definition of $D_P$:

$$D_{P_c} := \{\psi \in C^1(I) \mid \psi(2\pi) = \psi(0)\},$$

which is exactly the same as before except that now we do not require $\psi$ to vanish at the
boundary. We still have

$$P_c\colon D_{P_c} \to L^2(I), \qquad \psi \mapsto (-i)\psi',$$
so it follows that $P_I \subsetneq P_c$, where the I and c denote interval and circle respectively. In
other words $P_c$ is an extension of $P_I$.


A. Symmetric? 


Repeating the steps from above it is clear that P is still symmetric. Note however, it is
symmetric for a different reason: before we had $[\overline{\psi(x)}\varphi(x)]_0^{2\pi} = 0$ as both $\psi$ and $\varphi$ vanished
at the limits, whereas now it holds simply because $\psi(2\pi) = \psi(0)$ and likewise for $\varphi$.


16 Whew!
17 Not whew!


B. Self Adjoint? 


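As before we have

$$\psi \in D_{P_c^*} \implies \forall \varphi \in D_{P_c}: \langle \psi, P_c\varphi \rangle = \langle P_c^*\psi, \varphi \rangle.$$

However, recalling that taking the adjoint flips the inclusion, we have $P_c^* \subseteq P_I^*$ and therefore
$D_{P_c^*} \subseteq H^1(I)$. We can, therefore, replace the unknown $P_c^*$ with the known $P_I^*$ in the final
part of the above, i.e.

$$\langle \psi, P_c\varphi \rangle = \langle P_I^*\psi, \varphi \rangle.$$

Then, following the steps exactly as before, we arrive at
$$\begin{aligned}
0 &= \Big[\overline{\psi(x)}\,\varphi(x)\Big]_0^{2\pi} \\
&= \Big[\overline{\psi(2\pi)} - \overline{\psi(0)}\Big]\varphi(0) \\
\implies\quad \psi(2\pi) &= \psi(0),
\end{aligned}$$

giving us $^{18}$
$$D_{P_c^*} = \{\psi \in H^1(I) \mid \psi(2\pi) = \psi(0)\} =: H^1_{\mathrm{cyc}}(I),$$

so $D_{P_c} \subsetneq D_{P_c^*}$, and therefore $P_c$ is not self adjoint.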
C. Essentially Self Adjoint? 


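Again, as before, we have $P_c \subseteq \overline{P_c} \subseteq P_c^*$ and
$$\psi \in D_{\overline{P_c}} \implies \forall \varphi \in D_{P_c^*}: \langle \psi, P_c^*\varphi \rangle = \langle \overline{P_c}\psi, \varphi \rangle = \langle P_c^*\psi, \varphi \rangle,$$

which results in

$$\psi(2\pi) = \psi(0),$$

and so

$$D_{\overline{P_c}} = H^1_{\mathrm{cyc}}(I) = D_{P_c^*},$$

so we conclude that $P_c$ is essentially self adjoint and $\overline{P_c}$ is the unique self adjoint extension.
To summarise, we have found the momentum operator on a circle:

$$\overline{P_c}\colon H^1_{\mathrm{cyc}}(I) \to L^2(I), \qquad \psi \mapsto (-i)\psi'.$$


18 Note we are OK to extend the domain to all of $H^1(I)$ provided we impose the conditions above.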


10 Inverse spectral theorem 


This section is devoted to the development of all the notions and results necessary to 
understand and prove the spectral theorem, stated below. 


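Theorem 10.1 (Spectral theorem). For every self-adjoint operator $A\colon D_A \to H$ there is a
unique projection-valued measure $P_A\colon \sigma(O_{\mathbb{R}}) \to L(H)$ such that

$$A = \int_{\mathbb{R}} i_{\mathbb{R}}\,dP_A = \int_{\mathbb{R}} \lambda\,P_A(d\lambda),$$

where $i_{\mathbb{R}}\colon \mathbb{R} \hookrightarrow \mathbb{C}$ is the inclusion of $\mathbb{R}$ into $\mathbb{C}$.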


While useful in theory, existence results are often of limited use in practice since they 
usually only tell us that something exists, and not how to construct it. However, we should 
note here that the proof of the spectral theorem is, in fact, constructive in nature. Hence, 
given any self-adjoint operator A, we will be able to explicitly determine its associated
projection-valued measure $P_A$ along the following steps.


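(i) For each $\psi \in H$, construct the real-valued Borel measure $\mu^A_\psi\colon \sigma(O_{\mathbb{R}}) \to \mathbb{R}$ given by

$$\mu^A_\psi((-\infty, \lambda]) := \lim_{\delta \to 0^+} \lim_{\varepsilon \to 0^+} \frac{1}{\pi} \int_{-\infty}^{\lambda + \delta} dt\, \operatorname{Im}\langle \psi | R_A(t + i\varepsilon)\psi \rangle,$$
where $R_A\colon \rho(A) \to L(H)$ is the resolvent map of A. This is known as the Stieltjes
inversion formula. Note that while not every element in $\sigma(O_{\mathbb{R}})$ is of the form $(-\infty, \lambda]$,
such Borel measurable sets do generate the entire $\sigma(O_{\mathbb{R}})$ via unions, intersections and
set differences. Hence, the value of $\mu^A_\psi(\Omega)$ for $\Omega \in \sigma(O_{\mathbb{R}})$ can be determined by
applying the corresponding formulae for measures, namely $\sigma$-additivity, continuity
from above and measure of set differences.


(ii) For all $\psi, \varphi \in H$, define the complex-valued Borel measure $\mu^A_{\psi,\varphi}\colon \sigma(O_{\mathbb{R}}) \to \mathbb{C}$ by

$$\mu^A_{\psi,\varphi}(\Omega) := \tfrac{1}{4}\Big(\mu^A_{\psi+\varphi}(\Omega) - \mu^A_{\psi-\varphi}(\Omega) + i\mu^A_{\psi-i\varphi}(\Omega) - i\mu^A_{\psi+i\varphi}(\Omega)\Big).$$

(iii) Define the projection-valued measure $P_A\colon \sigma(O_{\mathbb{R}}) \to L(H)$ by requiring $P_A(\Omega)$, for
each $\Omega \in \sigma(O_{\mathbb{R}})$, to be the unique map in $L(H)$ satisfying

$$\forall \psi, \varphi \in H: \quad \langle \psi | P_A(\Omega)\varphi \rangle = \int_{\mathbb{R}} \chi_\Omega\, d\mu^A_{\psi,\varphi}.$$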
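Step (ii) is a polarisation identity, and it can be sanity-checked in finite dimensions. The sketch below is an illustration under assumptions that are not in the lectures: $H = \mathbb{C}^2$, a single fixed orthogonal projector standing in for $P_A(\Omega)$, and two arbitrary vectors. The inner product is taken antilinear in the first slot, matching the conventions used above.

```python
# Hypothetical data: H = C^2, P is the orthogonal projector onto span{(1, 1)}.
P = [[0.5, 0.5], [0.5, 0.5]]

def apply(M, v):
    """Matrix-vector product for 2x2 matrices."""
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def inner(u, v):
    """Inner product on C^2, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def mu(psi):
    """Real-valued 'measure' mu_psi = <psi, P psi> for the fixed projector P."""
    return inner(psi, apply(P, psi)).real

def comb(u, c, v):
    """Return the vector u + c * v."""
    return [a + c * b for a, b in zip(u, v)]

psi = [1 + 2j, -0.5j]
phi = [0.3 - 1j, 2 + 0j]

polarised = 0.25 * (mu(comb(psi, 1, phi)) - mu(comb(psi, -1, phi))
                    + 1j * mu(comb(psi, -1j, phi)) - 1j * mu(comb(psi, 1j, phi)))
direct = inner(psi, apply(P, phi))   # <psi, P phi>
print(abs(polarised - direct))       # agreement up to rounding
```

The real-valued quantities $\mu_\psi$ therefore determine every matrix element $\langle\psi|P\varphi\rangle$, which is what makes step (iii) possible.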


We will now make all the notions and constructions used herein precise. In fact, we 
will present the relevant definitions and results by taking the inverse route, starting with 
projection-valued measures and arriving at their associated self-adjoint operators, obtaining 
(and proving) what we will call the inverse spectral theorem. 




10.1 Projection-valued measures 


Projection-valued measures are, unsurprisingly, objects sharing characteristics of both mea- 
sures and projection operators. 


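Definition. A map $P\colon \sigma(O_{\mathbb{R}}) \to L(H)$ is called a projection-valued measure if it satisfies
the following properties.

(i) $\forall \Omega \in \sigma(O_{\mathbb{R}}): P(\Omega)^* = P(\Omega)$

(ii) $\forall \Omega \in \sigma(O_{\mathbb{R}}): P(\Omega) \circ P(\Omega) = P(\Omega)$

(iii) $P(\mathbb{R}) = \mathrm{id}_H$

(iv) For any pairwise disjoint sequence $\{\Omega_n\}_{n \in \mathbb{N}}$ in $\sigma(O_{\mathbb{R}})$ and any $\psi \in H$,

$$\sum_{n=0}^{\infty} P(\Omega_n)\psi = P\Big(\bigcup_{n=0}^{\infty} \Omega_n\Big)\psi.$$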


Remark 10.2. Note in the final condition we included $\psi \in H$ as, for the case of countably
infinitely many $\Omega_n$, we need to check convergence, which involves using the norm. Without the $\psi$
we would need to use the norm on $L(H)$, which may prove difficult. However, by including
the $\psi$ we can work with the norm on H itself.


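Lemma 10.3. Let $P\colon \sigma(O_{\mathbb{R}}) \to L(H)$ be a projection-valued measure. Then, for any
$\Omega, \Omega_1, \Omega_2 \in \sigma(O_{\mathbb{R}})$,

(i) $P(\emptyset) = 0$, where by 0 we mean $0 \in L(H)$

(ii) $P(\mathbb{R} \setminus \Omega) = \mathrm{id}_H - P(\Omega)$

(iii) $P(\Omega_1 \cup \Omega_2) = P(\Omega_1) + P(\Omega_2) - P(\Omega_1 \cap \Omega_2)$

(iv) $P(\Omega_1 \cap \Omega_2) = P(\Omega_1) \circ P(\Omega_2)$

(v) if $\Omega_1 \subseteq \Omega_2$, then $\operatorname{ran}(P(\Omega_1)) \subseteq \operatorname{ran}(P(\Omega_2))$.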
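Proof. Let $\Omega, \Omega_1, \Omega_2 \in \sigma(O_{\mathbb{R}})$ and $\psi \in H$.

(i)
$$\begin{aligned}
P(\emptyset)\psi = P(\emptyset \cup \emptyset)\psi &= \big(P(\emptyset) + P(\emptyset)\big)\psi = 2P(\emptyset)\psi \\
\therefore\ P(\emptyset)\psi = 0_H &\implies P(\emptyset) = 0.
\end{aligned}$$

(ii)
$$\begin{aligned}
P(\mathbb{R}) &= P\big((\mathbb{R} \setminus \Omega) \cup \Omega\big) = P(\mathbb{R} \setminus \Omega) + P(\Omega) \\
\therefore\ P(\mathbb{R} \setminus \Omega) &= \mathrm{id}_H - P(\Omega),
\end{aligned}$$
where we used the fact that $(\mathbb{R} \setminus \Omega) \cap \Omega = \emptyset$.

(iii)
$$P(\Omega_1) = P\big((\Omega_1 \cap \Omega_2) \cup (\Omega_1 \setminus \Omega_2)\big) = P(\Omega_1 \cap \Omega_2) + P(\Omega_1 \setminus \Omega_2),$$
and similarly for $P(\Omega_2)$. Also
$$\begin{aligned}
P(\Omega_1 \setminus \Omega_2) + P(\Omega_2 \setminus \Omega_1) + P(\Omega_1 \cap \Omega_2) &= P\big((\Omega_1 \setminus \Omega_2) \cup (\Omega_2 \setminus \Omega_1) \cup (\Omega_1 \cap \Omega_2)\big) \\
&= P(\Omega_1 \cup \Omega_2).
\end{aligned}$$
Putting this all together gives the result.

(iv) First consider $\Omega_1 \cap \Omega_2 = \emptyset$. Then, using (ii) from the definition we have

$$\begin{aligned}
[P(\Omega_1 \cup \Omega_2)]^2 &= [P(\Omega_1) + P(\Omega_2)]^2 \\
P(\Omega_1 \cup \Omega_2) &= P(\Omega_1) + P(\Omega_2) + P(\Omega_1) \circ P(\Omega_2) + P(\Omega_2) \circ P(\Omega_1) \\
\therefore\ P(\Omega_1) \circ P(\Omega_2) &= -P(\Omega_2) \circ P(\Omega_1) \\
\implies P(\Omega_1) \circ P(\Omega_2) \circ P(\Omega_2) &= -P(\Omega_2) \circ P(\Omega_1) \circ P(\Omega_2) \\
P(\Omega_1) \circ P(\Omega_2) &= P(\Omega_2) \circ P(\Omega_2) \circ P(\Omega_1) \\
P(\Omega_1) \circ P(\Omega_2) &= P(\Omega_2) \circ P(\Omega_1) \\
\therefore\ P(\Omega_1) \circ P(\Omega_2) &= 0 \quad \forall\, \Omega_1 \cap \Omega_2 = \emptyset.
\end{aligned}$$

Now from $P(\Omega_1) = P\big((\Omega_1 \setminus \Omega_2) \cup (\Omega_1 \cap \Omega_2)\big) = P(\Omega_1 \setminus \Omega_2) + P(\Omega_1 \cap \Omega_2)$, we have

$$\begin{aligned}
P(\Omega_1) \circ P(\Omega_2) &= \big[P(\Omega_1 \setminus \Omega_2) + P(\Omega_1 \cap \Omega_2)\big] \circ \big[P(\Omega_2 \setminus \Omega_1) + P(\Omega_1 \cap \Omega_2)\big] \\
&= P(\Omega_1 \cap \Omega_2) \circ P(\Omega_1 \cap \Omega_2) \\
&= P(\Omega_1 \cap \Omega_2),
\end{aligned}$$

where we have made use of the fact that $(\Omega_1 \setminus \Omega_2) \cap (\Omega_2 \setminus \Omega_1) = \emptyset$ etc.

(v) If $\Omega_1 \subseteq \Omega_2$ we have
$$P(\Omega_2) = P\big((\Omega_2 \setminus \Omega_1) \cup \Omega_1\big) = P(\Omega_2 \setminus \Omega_1) + P(\Omega_1),$$
which along with the fact that $P(\Omega) \ge 0$ gives the result.
□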
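The properties in Lemma 10.3 can be tried out on a toy projection-valued measure. The sketch below rests on assumptions made only for illustration: $H = \mathbb{C}^3$, a measure concentrated on three hypothetical points of the real line, and Borel sets represented by Python predicates. None of the names come from the lectures.

```python
# Hypothetical setup: H = C^3 and a PVM concentrated on three points of R.
POINTS = [-1.0, 0.5, 2.0]

def P(omega):
    """Diagonal projector P(omega); omega is a predicate standing in for a Borel set."""
    n = len(POINTS)
    return [[1.0 if (i == j and omega(POINTS[i])) else 0.0 for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(A, B, sign):
    """Entrywise A + sign * B."""
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

om1 = lambda x: -2.0 < x <= 1.0      # the set (-2, 1]
om2 = lambda x: 0.0 < x <= 3.0       # the set (0, 3]
cap = lambda x: om1(x) and om2(x)    # intersection
cup = lambda x: om1(x) or om2(x)     # union
empty = lambda x: False              # empty set

idempotent = matmul(P(om1), P(om1)) == P(om1)
product_rule = matmul(P(om1), P(om2)) == P(cap)                             # Lemma 10.3 (iv)
union_rule = combine(combine(P(om1), P(om2), +1), P(cap), -1) == P(cup)     # Lemma 10.3 (iii)
empty_is_zero = all(entry == 0.0 for row in P(empty) for entry in row)      # Lemma 10.3 (i)
print(idempotent, product_rule, union_rule, empty_is_zero)
```

In this diagonal model each $P(\Omega)$ simply switches on the basis vectors whose point lies in $\Omega$, which is why intersections turn into products.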


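Note most of these properties make sense simply by thinking of $P(\Omega)$ as the area of
the set $\Omega \in \sigma(O_{\mathbb{R}})$. For example $P(\Omega_1) = P(\Omega_1 \setminus \Omega_2) + P(\Omega_1 \cap \Omega_2)$ is

[Figure: diagram of the overlapping sets $\Omega_1$ and $\Omega_2$.]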


Remark 10.4. As noted before, it suffices to know $P((-\infty, \lambda])$ for all $\lambda \in \mathbb{R}$, and for this
reason a new notation is introduced:

$$P(\lambda) := P((-\infty, \lambda]).$$

However, as they are written they appear to have different domains. We therefore give a
new name to $P(\lambda)$: the resolution of the identity.


10.2 Real and Complex Valued Borel Measures Induced by a PVM 
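Definition. For all $\psi, \varphi \in H$ we define the $\mathbb{C}$-valued measure
$$\mu_{\psi,\varphi}\colon \sigma(O_{\mathbb{R}}) \to \mathbb{C}, \qquad \Omega \mapsto \mu_{\psi,\varphi}(\Omega) := \langle \psi, P(\Omega)\varphi \rangle.$$
Definition. For all $\psi \in H$ we define the $\mathbb{R}$-valued measure
$$\mu_\psi := \mu_{\psi,\psi}.$$
Proof that $\mu_\psi(\Omega) \in \mathbb{R}$.
$$\begin{aligned}
\overline{\mu_\psi(\Omega)} &= \overline{\langle \psi, P(\Omega)\psi \rangle} \\
&= \overline{\langle P(\Omega)\psi, \psi \rangle} \\
&= \langle \psi, P(\Omega)\psi \rangle \\
&= \mu_\psi(\Omega),
\end{aligned}$$
where we have used the fact that $P(\Omega)$ is self adjoint to go from the first to the second line. □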
10.3 Integration With Respect to a PVM 


We now wish to make sense of the operator $\int_{\mathbb{R}} f\,dP$ for measurable $f\colon \mathbb{R} \to \mathbb{C}$. We will
build this up in three steps:

(i) For simple f,
(ii) For bounded f, and
(iii) For not necessarily bounded $^{19}$ f.


Remark 10.5. As we shall see, if f is bounded (in the sense that there exists an $a \in \mathbb{R}$ such
that $|f(x)| \le a$ for all $x \in \mathbb{R}$) then the integral will have nice properties.
For example it will be linear in f, i.e.

$$\int_{\mathbb{R}} (\alpha f + g)\,dP = \alpha \int_{\mathbb{R}} f\,dP + \int_{\mathbb{R}} g\,dP,$$

for $\alpha \in \mathbb{C}$. However, if f is unbounded, domain issues will destroy the equality above. It is
important to note, though, that it is exactly the latter case we need as

$$f = i_{\mathbb{R}}\colon \mathbb{R} \hookrightarrow \mathbb{C}, \qquad x \mapsto x$$

in the Spectral theorem, which is clearly unbounded.


19 Recall footnote 5 from lecture 2: to us the term 'unbounded' means definitely not bounded.


A. Simple f 


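Recall a simple function is a measurable function that takes a finite number of values in
the target, i.e. if $f\colon \mathbb{R} \to \mathbb{C}$ then

$$f(\mathbb{R}) = \{f_1, \ldots, f_N\} \subset \mathbb{C}$$

for some $N \in \mathbb{N}$. This allows us to rewrite f as

$$f = \sum_{n=1}^{N} f_n\,\chi_{\Omega_n},$$

where $\Omega_n := \operatorname{preim}_f(\{f_n\})$ and $\chi$ is the characteristic function.


Definition. For simple $f\colon \mathbb{R} \to \mathbb{C}$ and PVM P we define

$$\int_{\mathbb{R}} f\,dP := \sum_{n=1}^{N} f_n\,P(\Omega_n).$$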


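Proposition 10.6. For simple f, $\int_{\mathbb{R}} f\,dP$ is linear in f.


Proof. Let $S(\mathbb{R}, \mathbb{C})$ denote the set of all simple functions $f\colon \mathbb{R} \to \mathbb{C}$. We can make this set
into a $\mathbb{C}$-vector space by inheriting the addition and s-multiplication from $\mathbb{C}$, namely define

$$(f + g)(x) := f(x) + g(x), \qquad (\alpha \cdot f)(x) := \alpha \cdot f(x),$$

for all $f, g \in S(\mathbb{R}, \mathbb{C})$, $\alpha \in \mathbb{C}$.
Now consider the preimage part. As f and g are both simple, for every $x \in \mathbb{R}$ there are values
$f_n, g_m \in \mathbb{C}$ with

$$(f + g)(x) = f_n + g_m,$$
and x lies in the measurable set $\operatorname{preim}_f(\{f_n\}) \cap \operatorname{preim}_g(\{g_m\})$. These intersections are
pairwise disjoint and cover $\mathbb{R}$, so f, g and $f + g$ can all be written as sums over this common
family of sets. It follows, using the additivity of P, that

$$\int_{\mathbb{R}} (f + g)\,dP = \int_{\mathbb{R}} f\,dP + \int_{\mathbb{R}} g\,dP.$$

A similar method gives the $\alpha \in \mathbb{C}$ condition of linearity. □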


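Remark 10.7. Observe that $\chi_\Omega$ for any $\Omega \in \sigma(O_{\mathbb{R}})$ is simple (it only takes the values 0 or
1), and hence

$$\int_{\mathbb{R}} \chi_\Omega\,dP = 1 \cdot P(\Omega) + 0 \cdot P(\mathbb{R} \setminus \Omega) = P(\Omega).$$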
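The definition of the integral of a simple function, and the observation in Remark 10.7, can be played through on a toy model. This is a sketch under illustrative assumptions: $H = \mathbb{C}^3$, a PVM concentrated on three hypothetical points, sets given as predicates, and a helper name `integrate_simple` that does not appear in the lectures.

```python
# Hypothetical setup: H = C^3 and a PVM concentrated on three points of R.
POINTS = [-1.0, 0.5, 2.0]

def P(omega):
    """Diagonal projector P(omega); omega is a predicate standing in for a Borel set."""
    n = len(POINTS)
    return [[1.0 if (i == j and omega(POINTS[i])) else 0.0 for j in range(n)]
            for i in range(n)]

def integrate_simple(pieces):
    """Return sum_n f_n * P(Omega_n) for a simple f given as [(f_n, Omega_n), ...]."""
    n = len(POINTS)
    total = [[0j] * n for _ in range(n)]
    for value, omega in pieces:
        proj = P(omega)
        for i in range(n):
            for j in range(n):
                total[i][j] += value * proj[i][j]
    return total

omega = lambda x: 0.0 < x <= 3.0
complement = lambda x: not omega(x)

# Remark 10.7: the characteristic function integrates to P(omega).
chi_integral = integrate_simple([(1.0, omega), (0.0, complement)])
print(chi_integral == P(omega))

# A two-valued simple function: f = 2 on (-inf, 0], f = 5i on (0, inf).
f_integral = integrate_simple([(2.0, lambda x: x <= 0.0), (5j, lambda x: x > 0.0)])
print([f_integral[i][i] for i in range(3)])
```

In this model the integral is the diagonal operator that multiplies each basis vector by the value of f at the corresponding point.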
R 




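Remark 10.8. Observe also that for any $\psi, \varphi \in H$,

$$\begin{aligned}
\Big\langle \psi, \Big(\int_{\mathbb{R}} f\,dP\Big)\varphi \Big\rangle &= \Big\langle \psi, \sum_{n=1}^{N} f_n P(\Omega_n)\varphi \Big\rangle \\
&= \sum_{n=1}^{N} f_n \langle \psi, P(\Omega_n)\varphi \rangle \\
&= \sum_{n=1}^{N} f_n\,\mu_{\psi,\varphi}(\Omega_n) \\
&= \int_{\mathbb{R}} f\,d\mu_{\psi,\varphi},
\end{aligned}$$

where Proposition 6.7 was used.


Definition. For simple f we can define the map

$$\int_{\mathbb{R}} dP\colon S(\mathbb{R}, \mathbb{C}) \to L(H), \qquad f \mapsto \int_{\mathbb{R}} f\,dP,$$

which, if we equip $S(\mathbb{R}, \mathbb{C})$ with the supremum norm and $L(H)$ with its operator norm,
has operator norm $\|\int_{\mathbb{R}} dP\| = 1$.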


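Proof. We have already shown that $\int_{\mathbb{R}} f\,dP \in L(H)$ (i.e. it is linear), so we just need to
show the norm condition. First, let $f \in S(\mathbb{R}, \mathbb{C})$ and $\psi \in H$, then

$$\begin{aligned}
\Big\| \Big(\int_{\mathbb{R}} f\,dP\Big)\psi \Big\|_H^2 &= \Big\langle \Big(\int_{\mathbb{R}} f\,dP\Big)\psi, \Big(\int_{\mathbb{R}} f\,dP\Big)\psi \Big\rangle \\
&= \Big\langle \sum_{n=1}^{N} f_n P(\Omega_n)\psi, \sum_{m=1}^{N} f_m P(\Omega_m)\psi \Big\rangle \\
&= \Big\langle \psi, \sum_{n,m=1}^{N} \overline{f_n} f_m\,P(\Omega_n)P(\Omega_m)\psi \Big\rangle \\
&= \Big\langle \psi, \sum_{n,m=1}^{N} \overline{f_n} f_m\,\delta_{nm}\,P(\Omega_m)\psi \Big\rangle \\
&= \sum_{n=1}^{N} |f_n|^2 \langle \psi, P(\Omega_n)\psi \rangle \\
&= \sum_{n=1}^{N} |f_n|^2\,\mu_\psi(\Omega_n) \\
&= \int_{\mathbb{R}} |f|^2\,d\mu_\psi \\
&\le \|f\|_\infty^2 \int_{\mathbb{R}} d\mu_\psi \\
&= \|f\|_\infty^2\,\|\psi\|_H^2,
\end{aligned}$$

where we have used the definition of the norm in terms of the inner product, the fact
that P is self adjoint, the fact that $P(\Omega_n)P(\Omega_m) = \delta_{nm}P(\Omega_m)$ for pairwise disjoint $\Omega_n$, $\Omega_m$, and
Proposition 6.27 along with the fact that $\|f\|_\infty := \sup_{x \in \mathbb{R}} |f(x)|$. The equality in the last
inequality can be attained provided f and $\psi$ are suitably chosen.

Thus we have

$$\Big\| \int_{\mathbb{R}} dP \Big\| := \sup_{f \in S(\mathbb{R}, \mathbb{C})} \frac{\big\| \int_{\mathbb{R}} f\,dP \big\|_{L(H)}}{\|f\|_\infty}
= \sup_{f \in S(\mathbb{R}, \mathbb{C}),\ \psi \in H} \frac{\big\| \big(\int_{\mathbb{R}} f\,dP\big)\psi \big\|_H}{\|f\|_\infty\,\|\psi\|_H} = 1.$$
□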


B. Bounded Borel Functions 


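Definition. The set of all bounded, measurable functions is denoted
$$B(\mathbb{R}, \mathbb{C}) := \{f\colon \mathbb{R} \to \mathbb{C} \mid f \text{ measurable},\ \|f\|_\infty < \infty\}.$$
Proposition 10.9. The set B can be made into a Banach space by defining the norm

$$\|f\|_B := \sup_{x \in \mathbb{R}} |f(x)|.$$


Proof. We turn the set into a linear vector space in the usual manner; we inherit the addition
and s-multiplication from $\mathbb{C}$.

Now prove $\|f\|_B$ is a norm. Comparing to the definition given at the bottom of Page
9, for $f, g \in B(\mathbb{R}, \mathbb{C})$ and $z \in \mathbb{C}$ we have

(i) Clearly $\|f\|_B \ge 0$.

(ii)
$$\|f\|_B = 0 \implies \sup_{x \in \mathbb{R}} |f(x)| = 0 \implies f(x) = 0 \ \ \forall x \in \mathbb{R},$$
so $f = 0$.

(iii)
$$\|z \cdot f\|_B = \sup_{x \in \mathbb{R}} |z \cdot f(x)| = |z| \sup_{x \in \mathbb{R}} |f(x)| = |z| \cdot \|f\|_B.$$

(iv)
$$\begin{aligned}
\|f + g\|_B &= \sup_{x \in \mathbb{R}} |(f + g)(x)| \\
&= \sup_{x \in \mathbb{R}} |f(x) + g(x)| \\
&\le \sup_{x \in \mathbb{R}} |f(x)| + \sup_{x \in \mathbb{R}} |g(x)| \\
&= \|f\|_B + \|g\|_B.
\end{aligned}$$

Now let $\{f_n\}_{n \in \mathbb{N}}$ be a Cauchy sequence in $B(\mathbb{R}, \mathbb{C})$, that is: $\forall \varepsilon > 0$, $\exists N \in \mathbb{N}$: $\forall m, n \ge N$
we have

$$d(f_n, f_m) := \|f_n - f_m\|_B = \sup_{x \in \mathbb{R}} |f_n(x) - f_m(x)| < \varepsilon.$$
Now from the definition of the supremum we have

$$|f_n(x) - f_m(x)| \le \|f_n - f_m\|_B,$$

so it follows that the sequence $\{f_n(x)\}_{n \in \mathbb{N}}$ is a Cauchy sequence in $\mathbb{C}$. But $\mathbb{C}$ is a complete
metric space so we know that this Cauchy sequence converges in $\mathbb{C}$, i.e.

$$\lim_{n \to \infty} f_n(x) =: z_x \in \mathbb{C}.$$
We can thus define a point-wise limit f of the sequence $\{f_n\}_{n \in \mathbb{N}} \subset B(\mathbb{R}, \mathbb{C})$ as

$$f(x) := \lim_{n \to \infty} f_n(x) = z_x$$

for all $x \in \mathbb{R}$. Then, by equipping $\mathbb{C}$ and $\mathbb{R}$ with their respective Borel $\sigma$-algebras, Proposition
5.22 tells us that f is measurable.

Finally from the fact that $B(\mathbb{R}, \mathbb{C}) \subseteq L(\mathbb{R}, \mathbb{C})$, Theorem 2.8 tells us that f is bounded
and so $B(\mathbb{R}, \mathbb{C})$ is a Banach space. □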


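Corollary 10.10. Observe that $S(\mathbb{R}, \mathbb{C})$ is in fact a dense, linear subspace of $B(\mathbb{R}, \mathbb{C})$. Thus,
the BLT theorem tells us that we have a unique extension of the operator

$$\int_{\mathbb{R}} dP\colon S(\mathbb{R}, \mathbb{C}) \to L(H)$$
to the domain $B(\mathbb{R}, \mathbb{C})$ with equal norm. That is, we have an operator

$$\int_{\mathbb{R}} dP\colon B(\mathbb{R}, \mathbb{C}) \to L(H)$$

with $\|\int_{\mathbb{R}} dP\| = 1$.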


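Corollary 10.11. By suitable definition we can turn the space $B(\mathbb{R}, \mathbb{C})$ into a $C^*$-algebra
and our operator then has the following properties

(i)
$$\int_{\mathbb{R}} 1\,dP = \mathrm{id}_H$$

(ii)
$$\int_{\mathbb{R}} (f \cdot g)\,dP = \Big(\int_{\mathbb{R}} f\,dP\Big) \circ \Big(\int_{\mathbb{R}} g\,dP\Big)$$

(iii)
$$\int_{\mathbb{R}} \overline{f}\,dP = \Big(\int_{\mathbb{R}} f\,dP\Big)^*$$

These properties collectively make the operator $\int_{\mathbb{R}} dP$ a $C^*$-algebra homomorphism.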


C. General Borel Function 


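We now want to allow for the case that f is unbounded. We will write the following such
that it reduces to the above when f is bounded.


Definition. Let $f\colon \mathbb{R} \to \mathbb{C}$ be measurable, then we define the linear map
$$\int_{\mathbb{R}} f\,dP\colon D_{\int_{\mathbb{R}} f\,dP} \to H,$$

where

$$D_{\int_{\mathbb{R}} f\,dP} := \Big\{\psi \in H \ \Big|\ \int_{\mathbb{R}} |f|^2\,d\mu_\psi < \infty\Big\} \subseteq H$$
is a dense, linear subspace. The map is defined via

$$\Big(\int_{\mathbb{R}} f\,dP\Big)\psi := \lim_{n \to \infty} \Big[\Big(\int_{\mathbb{R}} f_n\,dP\Big)\psi\Big],$$

where the sequence $\{f_n\}_{n \in \mathbb{N}} \subset B(\mathbb{R}, \mathbb{C})$ is defined by

$$f_n := \chi_{\{x \in \mathbb{R} \,\mid\, |f(x)| < n\}}\,f.$$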


Remark 10.12. Note that the map above includes the f; it is not just the integral defined
in the previous section. Note also that for the case when $f \in B(\mathbb{R}, \mathbb{C})$ we just recover the case
above and we have $D_{\int_{\mathbb{R}} f\,dP} = H$ (i.e. we have an element of $L(H)$). Otherwise it is a proper subset.


Remark 10.13. The literature often introduces the notation

$$D_f := D_{\int_{\mathbb{R}} f\,dP},$$

however this could lead one to think of the domain of f itself, which here is $\mathbb{R}$. We will
avoid this notation.


Remark 10.14. The sequence $\{f_n\}_{n \in \mathbb{N}}$ can be thought of as 'chopping' f into bounded parts.

[Figure: sketch of f together with its truncation $f_n$.]


Lemma 10.15. The sequence $\{f_n\}_{n \in \mathbb{N}}$ is Cauchy in $L^2(\mathbb{R})$. This in turn implies that the
sequence $\{(\int_{\mathbb{R}} f_n\,dP)\psi\}_{n \in \mathbb{N}}$ is Cauchy in H, which is required for the limit in the definition
to make sense, i.e. the result lies in H.


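For this general case of a measurable f we have:

(i) As before
$$\int_{\mathbb{R}} \overline{f}\,dP = \Big(\int_{\mathbb{R}} f\,dP\Big)^*.$$

(ii) For $\alpha \in \mathbb{C}$ and f, g measurable,

$$\Big(\alpha \int_{\mathbb{R}} f\,dP + \int_{\mathbb{R}} g\,dP\Big) \subseteq \int_{\mathbb{R}} (\alpha f + g)\,dP,$$

where the equality holds only for bounded f and g. As explained earlier, the inclusion
arises due to domain issues. We can now see this more explicitly from the definition
of $D_{\int_{\mathbb{R}} f\,dP}$: just because f and g are both measurable, it does not mean that
their respective map domains will coincide. However, the domain for the LHS is
$D_{\int_{\mathbb{R}} (|\alpha f| + |g|)\,dP}$.

(iii)
$$\Big(\int_{\mathbb{R}} f\,dP\Big) \circ \Big(\int_{\mathbb{R}} g\,dP\Big) \subseteq \int_{\mathbb{R}} (f \cdot g)\,dP,$$
again where the equality holds only when f and g are bounded.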


10.4 The Inverse Spectral Theorem 


We are now in a place where we can understand the inverse spectral theorem. 


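Definition. Given a PVM P, we can construct a self adjoint operator $A_P$ as

$$A_P := \int_{\mathbb{R}} \mathrm{id}_{\mathbb{R}}\,dP,$$

where $\mathrm{id}_{\mathbb{R}}\colon \mathbb{R} \hookrightarrow \mathbb{C}$ is the inclusion map.
Proof.

$$\begin{aligned}
(A_P)^* &= \int_{\mathbb{R}} \overline{\mathrm{id}_{\mathbb{R}}}\,dP \\
&= \int_{\mathbb{R}} \mathrm{id}_{\mathbb{R}}\,dP \\
&= A_P,
\end{aligned}$$

so $A_P$ is self adjoint. □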




11 Spectral theorem 


The inverse spectral theorem tells us how to construct a self adjoint operator $A_P$ given a
projection valued measure P. The aim of this lecture is to do the opposite; given a self
adjoint operator A we want to find a PVM $P_A$. We want the two methods to be in unison,
that is, we want $A_{P_A} = A$ and $P_{A_P} = P$. We shall start by assuming that A can be
written in integral form, and then shall remove this restriction.


11.1 Measurable Function Applied To A Spectrally Decomposable Self Adjoint 
Operator 


The term spectrally decomposable means that the operator can be written in integral form.


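Definition. Let the self adjoint operator A be spectrally decomposable; i.e. there exists a
PVM P such that
$$A = \int_{\mathbb{R}} \mathrm{id}_{\mathbb{R}}\,dP,$$

then for any measurable function $f\colon \mathbb{R} \to \mathbb{C}$, we define the operator

$$f(A)\colon D_{\int_{\mathbb{R}} f\,dP} \to H,$$

given by

$$f(A) := \int_{\mathbb{R}} (f \circ \mathrm{id}_{\mathbb{R}})\,dP = \int_{\mathbb{R}} f(\lambda)\,P(d\lambda).$$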


Remark 11.1. The spectral theorem will show that every self adjoint operator A is spectrally
decomposable by virtue of a uniquely determined P.


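Corollary 11.2. If $f\colon \mathbb{R} \to \mathbb{R}$, then $f(A)$ is again self adjoint.
Proof.
$$f(A)^* = \Big(\int_{\mathbb{R}} (f \circ \mathrm{id}_{\mathbb{R}})\,dP\Big)^* = \int_{\mathbb{R}} \overline{(f \circ \mathrm{id}_{\mathbb{R}})}\,dP = \int_{\mathbb{R}} (f \circ \mathrm{id}_{\mathbb{R}})\,dP = f(A). \qquad \square$$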


Let us now consider two important examples. 


Example 11.3. Let $A = \int_{\mathbb{R}} \lambda\,P(d\lambda)$ be self adjoint. Then

$$\exp(A) := \int_{\mathbb{R}} e^{\lambda}\,P(d\lambda)$$
is self adjoint due to the previous Corollary. However

$$\exp(iA) := \int_{\mathbb{R}} e^{i\lambda}\,P(d\lambda)$$

is not self adjoint. This latter case is of high importance in QM, as can be seen by revisiting
Axiom 4 at the start.
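A finite-dimensional sketch makes the contrast between $\exp(A)$ and $\exp(iA)$ tangible. The data below are assumptions chosen for illustration only: $H = \mathbb{C}^2$, two orthogonal projectors $P_1$, $P_2$ and two hypothetical eigenvalues, so that $f(A) = f(\lambda_1)P_1 + f(\lambda_2)P_2$ plays the role of $\int f(\lambda)P(d\lambda)$.

```python
import cmath

# Hypothetical spectral data on H = C^2: A = 1.0 * P1 + (-2.0) * P2.
P1 = [[0.5, 0.5], [0.5, 0.5]]
P2 = [[0.5, -0.5], [-0.5, 0.5]]
LAMBDAS = [1.0, -2.0]

def f_of_A(f):
    """Finite-dimensional stand-in for f(A) = integral of f(lambda) P(d lambda)."""
    return [[f(LAMBDAS[0]) * P1[i][j] + f(LAMBDAS[1]) * P2[i][j] for j in range(2)]
            for i in range(2)]

def dagger(M):
    """Conjugate transpose."""
    return [[complex(M[j][i]).conjugate() for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

exp_A = f_of_A(lambda x: cmath.exp(x))
exp_iA = f_of_A(lambda x: cmath.exp(1j * x))
identity = [[1.0, 0.0], [0.0, 1.0]]

print(close(exp_A, dagger(exp_A)))                    # real f: self adjoint
print(close(exp_iA, dagger(exp_iA)))                  # complex f: not self adjoint
print(close(matmul(exp_iA, dagger(exp_iA)), identity))  # but unitary
```

The last check reflects $|e^{i\lambda}| = 1$: the operator $\exp(iA)$ fails to be self adjoint yet preserves norms.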
Example 11.4. Let $A = \int_{\mathbb{R}} \lambda\,P(d\lambda)$ be self adjoint and $\Omega \in \sigma(O_{\mathbb{R}})$. Then

$$P(\Omega) = \int_{\mathbb{R}} \chi_\Omega\,dP$$


implies / 


which is one of the projection conditions. 
11.2 Reconstruct PVM From a Spectrally Decomposable, Self Adjoint Operator


The key to reconstructing the associated PVM P is to consider the resolvents.


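Definition. Given a spectrally decomposable operator A and an element z of the resolvent set $\rho(A)$,
we define

$$r_z\colon \mathbb{R} \to \mathbb{C}, \qquad \lambda \mapsto \frac{1}{\lambda - z},$$

in terms of which the resolvent is rewritten as

$$R_A(z) = r_z(A),$$

and, due to the fact that A is spectrally decomposable, satisfies
$$r_z(A) = \int_{\mathbb{R}} (r_z \circ \mathrm{id}_{\mathbb{R}})\,dP = \int_{\mathbb{R}} \frac{1}{\lambda - z}\,P(d\lambda).$$

Note, using the results in the previous lecture, we have that for any $\psi \in H$,
$$\begin{aligned}
\langle \psi, R_A(z)\psi \rangle &= \Big\langle \psi, \Big(\int_{\mathbb{R}} (r_z \circ \mathrm{id}_{\mathbb{R}})\,dP\Big)\psi \Big\rangle \\
&= \int_{\mathbb{R}} (r_z \circ \mathrm{id}_{\mathbb{R}})\,d\mu_\psi \\
&= \int_{\mathbb{R}} \frac{1}{\lambda - z}\,\mu_\psi(d\lambda).
\end{aligned}$$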

Definition. A Herglotz function is an analytic complex function that maps the upper 
half plane into itself, but need not be surjective or injective. They are also known as 


Nevanlinna/Pick/R functions. 




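Theorem 11.5. The function

$$\langle \psi, R_A(\cdot)\psi \rangle\colon \mathbb{C} \to \mathbb{C}$$

is Herglotz.
Proof. Recall that

$$\mu_\psi$$

is real-valued. Then using

$$\operatorname{Im}(z) = \frac{1}{2i}(z - \overline{z})$$

we have

$$\begin{aligned}
\operatorname{Im}\langle \psi, R_A(z)\psi \rangle &= \frac{1}{2i}\Big[\int_{\mathbb{R}} \frac{1}{\lambda - z}\,\mu_\psi(d\lambda) - \int_{\mathbb{R}} \frac{1}{\lambda - \overline{z}}\,\mu_\psi(d\lambda)\Big] \\
&= \frac{1}{2i}\int_{\mathbb{R}} \frac{z - \overline{z}}{|\lambda - z|^2}\,\mu_\psi(d\lambda) \\
&= \operatorname{Im}(z)\int_{\mathbb{R}} \frac{1}{|\lambda - z|^2}\,\mu_\psi(d\lambda).
\end{aligned}$$

Then, since the integral is Lebesgue and the integrand is non-negative,
$$\operatorname{Im}\langle \psi, R_A(z)\psi \rangle > 0 \iff \operatorname{Im}(z) > 0.$$
□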

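Recalling the start of last lecture, if we can find a way to construct $\mu_\psi$ from our A then
we can use that to reconstruct P. The result of this is the previously mentioned Stieltjes
Inversion Formula, and it is obtained as follows.

Let $t, \varepsilon \in \mathbb{R}$ with $\varepsilon > 0$. Then, since A is self adjoint and so its spectrum is purely real, $t + i\varepsilon \in \rho(A)$.
This allows us to act on it with $R_A$. Thus, consider

$$\begin{aligned}
\lim_{\varepsilon \to 0^+} \frac{1}{\pi}\int_{t_1}^{t_2} dt\, \operatorname{Im}\langle \psi, R_A(t + i\varepsilon)\psi \rangle
&= \lim_{\varepsilon \to 0^+} \frac{1}{\pi}\int_{t_1}^{t_2} dt \int_{\mathbb{R}} \mu_\psi(d\lambda)\,\frac{\varepsilon}{(\lambda - t)^2 + \varepsilon^2} \\
&= \lim_{\varepsilon \to 0^+} \int_{\mathbb{R}} \mu_\psi(d\lambda)\,\Big[\frac{1}{\pi}\int_{t_1}^{t_2} dt\,\frac{\varepsilon}{(\lambda - t)^2 + \varepsilon^2}\Big],
\end{aligned}$$

where Fubini's Theorem $^{20}$ has been used. The inner integral is a standard integral, with
result
$$\frac{1}{\pi}\int_{t_1}^{t_2} dt\,\frac{\varepsilon}{(\lambda - t)^2 + \varepsilon^2} = \frac{1}{\pi}\Big[\arctan\Big(\frac{t - \lambda}{\varepsilon}\Big)\Big]_{t_1}^{t_2}.$$

Now strictly, at this stage, we cannot simply pull the $\varepsilon$ limit into this expression; we would
need to check that the above result is bounded first and then, by dominated convergence,
we can pull it in. This will turn out to be true, and so in order to simplify the following we
consider $\varepsilon$ to be small here.

20 See Wiki

In order to work out the above expression, we can use the $\lambda$-graphs. Let's plot both
terms (including the overall minus sign that comes with $t_1$) on the same graph:

[Figure: graphs of $\frac{1}{\pi}\arctan\big(\frac{t_2 - \lambda}{\varepsilon}\big)$ and $-\frac{1}{\pi}\arctan\big(\frac{t_1 - \lambda}{\varepsilon}\big)$ against $\lambda$, each taking values between $-\frac{1}{2}$ and $\frac{1}{2}$.]

Taking the limit and adding gives

[Figure: step function of $\lambda$, equal to 1 between $t_1$ and $t_2$ and to $\frac{1}{2}$ at the endpoints $t_1$ and $t_2$.]

So we have
$$\lim_{\varepsilon \to 0^+} \frac{1}{\pi}\int_{t_1}^{t_2} dt\,\frac{\varepsilon}{(\lambda - t)^2 + \varepsilon^2} = \frac{1}{2}\big(\chi_{(t_1, t_2)} + \chi_{[t_1, t_2]}\big)(\lambda),$$
and

$$\lim_{\varepsilon \to 0^+} \frac{1}{\pi}\int_{t_1}^{t_2} dt\, \operatorname{Im}\langle \psi, R_A(t + i\varepsilon)\psi \rangle = \frac{1}{2}\int_{\mathbb{R}} \big(\chi_{(t_1, t_2)} + \chi_{[t_1, t_2]}\big)(\lambda)\,\mu_\psi(d\lambda).$$

Finally, we have the Stieltjes Inversion Formula.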

Finally, we have the Stieltjes Inversion Formula. 


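Theorem 11.6 (Stieltjes Inversion Formula). Given a spectrally decomposable, self adjoint
operator A and its associated resolvent map $R_A$, we can construct a real-valued measure

$$\mu_\psi((-\infty, \lambda]) = \lim_{\delta \to 0^+} \lim_{\varepsilon \to 0^+} \frac{1}{\pi}\int_{-\infty}^{\lambda + \delta} dt\, \operatorname{Im}\langle \psi, R_A(t + i\varepsilon)\psi \rangle.$$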


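Proof.
$$\begin{aligned}
\lim_{\delta \to 0^+} \lim_{\varepsilon \to 0^+} \frac{1}{\pi}\int_{-\infty}^{\lambda + \delta} dt\, \operatorname{Im}\langle \psi, R_A(t + i\varepsilon)\psi \rangle
&= \lim_{\delta \to 0^+} \frac{1}{2}\int_{\mathbb{R}} \big(\chi_{(-\infty, \lambda + \delta)} + \chi_{(-\infty, \lambda + \delta]}\big)(\lambda')\,\mu_\psi(d\lambda') \\
&= \int_{\mathbb{R}} \chi_{(-\infty, \lambda]}(\lambda')\,\mu_\psi(d\lambda') \\
&= \mu_\psi((-\infty, \lambda]),
\end{aligned}$$

where we used the fact that $\chi_\Omega$ is bounded to move the limit inside the integral along
with the fact that

$$\lim_{\delta \to 0^+} (-\infty, \lambda + \delta) = (-\infty, \lambda] = \lim_{\delta \to 0^+} (-\infty, \lambda + \delta].$$
□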
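The inversion formula can be tested on a measure where everything is explicit. The sketch below assumes a hypothetical pure point measure $\mu_\psi = \sum_k w_k \delta_{\lambda_k}$ (weights and points are made up for illustration), for which $\operatorname{Im}\langle\psi, R_A(t + i\varepsilon)\psi\rangle = \sum_k w_k\,\varepsilon / ((\lambda_k - t)^2 + \varepsilon^2)$ and the $t$-integral is the arctan expression derived above. The two limits are approximated by small fixed values of $\varepsilon$ and $\delta$.

```python
import math

# Hypothetical pure point measure: pairs (lambda_k, w_k).
ATOMS = [(-1.0, 0.2), (0.5, 0.5), (2.0, 0.3)]

def smeared_cdf(lam, delta, eps):
    """(1/pi) * integral over (-inf, lam + delta] of Im <psi, R_A(t + i eps) psi> dt.

    Evaluated in closed form: each atom contributes
    w_k * (arctan((lam + delta - lambda_k) / eps) + pi / 2) / pi.
    """
    upper = lam + delta
    return sum(w * (math.atan((upper - lk) / eps) + math.pi / 2) / math.pi
               for lk, w in ATOMS)

def exact_cdf(lam):
    """mu_psi((-inf, lam]) computed directly from the atoms."""
    return sum(w for lk, w in ATOMS if lk <= lam)

for lam in (-3.0, -1.0, 0.0, 0.5, 1.0, 2.0, 5.0):
    approx = smeared_cdf(lam, delta=1e-4, eps=1e-9)
    print(f"lambda = {lam:5.1f}   inversion = {approx:.6f}   exact = {exact_cdf(lam):.6f}")

# Without the delta shift, an atom sitting exactly at the endpoint is counted with weight 1/2.
half_counted = smeared_cdf(0.5, delta=0.0, eps=1e-9)
print(half_counted)
```

The last value shows why the outer limit $\delta \to 0^+$ is needed: it turns the half-counted endpoint into a fully counted one, giving the closed interval $(-\infty, \lambda]$.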


Remark 11.7. Note the fact that $\langle \psi, R_A(t + i\varepsilon)\psi \rangle$ is Herglotz, together with the fact that $\varepsilon > 0$, gives
us that $\mu_\psi \ge 0$, which is required for it to be a real-valued measure.


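Remark 11.8. If we already know that A is spectrally decomposable w.r.t. some PVM P,
then we can recover P from A by virtue of: for any $\Omega \in \sigma(O_{\mathbb{R}})$ and for all $\psi, \varphi \in H$

$$\langle \psi, P(\Omega)\varphi \rangle = \int_{\mathbb{R}} \chi_\Omega\,d\mu_{\psi,\varphi},$$
where $\mu_{\psi,\varphi}$ is obtained from $\mu_\psi$ via the formula given at the start of the previous lecture.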


11.3 Construction Of PVM From A Self Adjoint Operator 


We need to free ourselves from the fact that A is known to be spectrally decomposable 
from the start. We could do this by trying to recreate the above method, i.e. arrive at the 
Stieltjes Inversion Formula for an operator A, by showing that 


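(i) $\langle \psi, R_A(\cdot)\psi \rangle\colon \mathbb{C} \to \mathbb{C}$ is Herglotz for any self adjoint A
(ii) $\langle \psi, P(\Omega)\varphi \rangle := \int_{\mathbb{R}} \chi_\Omega\,d\mu_{\psi,\varphi}$ is indeed a PVM.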
In order to prove these we first need a new theorem. 


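Theorem 11.9 (First-Resolvent Formula). For any operator $A\colon D_A \to H$ and $a, b \in \rho(A)$
we have
$$R_A(a) - R_A(b) = (a - b)R_A(a)R_A(b) = (a - b)R_A(b)R_A(a).$$


Proof. Consider

$$\begin{aligned}
R_A(a) - (a - b)R_A(a)R_A(b) &:= (A - a)^{-1} - (a - b)(A - a)^{-1}(A - b)^{-1} \\
&= (A - a)^{-1}\big[\mathrm{id}_H - (a - b)(A - b)^{-1}\big] \\
&= (A - a)^{-1}\big[\mathrm{id}_H - (a - A + A - b)(A - b)^{-1}\big] \\
&= (A - a)^{-1}\big[\mathrm{id}_H + (A - a)(A - b)^{-1} - (A - b)(A - b)^{-1}\big] \\
&= (A - b)^{-1} \\
&= R_A(b),
\end{aligned}$$

and similarly for the other result. □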
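As a quick plausibility check, the first resolvent formula can be verified for a small matrix. The sketch below assumes a hypothetical real symmetric $2 \times 2$ matrix for A and two arbitrary non-real points a, b (which lie in the resolvent set because the spectrum is real); the helper names are illustrative.

```python
# Hypothetical self adjoint operator on C^2.
A = [[2.0, 1.0], [1.0, 3.0]]

def resolvent(z):
    """R_A(z) = (A - z)^{-1} for a 2x2 matrix, via the adjugate formula."""
    m00, m01, m10, m11 = A[0][0] - z, A[0][1], A[1][0], A[1][1] - z
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det], [-m10 / det, m00 / det]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b = 1j, -2 + 0.5j
Ra, Rb = resolvent(a), resolvent(b)

lhs = [[Ra[i][j] - Rb[i][j] for j in range(2)] for i in range(2)]
prod_ab = matmul(Ra, Rb)
prod_ba = matmul(Rb, Ra)
rhs_ab = [[(a - b) * prod_ab[i][j] for j in range(2)] for i in range(2)]
rhs_ba = [[(a - b) * prod_ba[i][j] for j in range(2)] for i in range(2)]

err_ab = max(abs(lhs[i][j] - rhs_ab[i][j]) for i in range(2) for j in range(2))
err_ba = max(abs(lhs[i][j] - rhs_ba[i][j]) for i in range(2) for j in range(2))
print(err_ab, err_ba)   # both at rounding level
```

Both orderings of the product agree, reflecting the fact that resolvents of the same operator commute.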


The proof of (i) and (ii) above was on a problem sheet. Return and do this later.
In conclusion, the Spectral Theorem, Theorem 10.1, together with the recipe for the
construction of the PVM P from a self adjoint operator A, holds.


11.4 Commuting Operators 


The study of QM is teeming with so called commutators. However, they are not as simply
defined as is often erroneously assumed. In particular, the commutator between the position
and momentum given by
$$[Q^i, P_j] = i\hbar\,\delta^i_j$$
is not even defined, unless further provisions are given. This formula appears in the opening
sections of almost all QM textbooks though, and so it's important we understand what is
meant.
One can happily write the commutator provided the operators involved are bounded
(which, from lecture 9, we see that at least $P_j$ is not).


Definition. Let $B_1, B_2 \in L(H)$, i.e. they are bounded linear operators from H to H. Then
one may define
$$[B_1, B_2] := B_1 \circ B_2 - B_2 \circ B_1,$$
where
$$[B_1, B_2] \in L(H).$$
Remark 11.10. The tuple $(L(H), +, \cdot, [\cdot,\cdot])$ is a Lie algebra $^{21}$.


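Corollary 11.11. Let $A, B, C \in L(H)$, then the following holds
$$[A \circ B, C] = [A, C] \circ B + A \circ [B, C].$$
Proof. Consider the action on some arbitrary $\psi \in H$.

$$\begin{aligned}
[A \circ B, C]\psi &:= (A \circ B) \circ (C\psi) - C \circ (A \circ B\psi) \\
&= A \circ (B \circ C\psi) - C \circ (A \circ B\psi) \\
&= A \circ (C \circ B\psi + [B, C]\psi) - C \circ (A \circ B\psi) \\
&= (A \circ C) \circ B\psi + A \circ [B, C]\psi - C \circ (A \circ B\psi) \\
&= (C \circ A + [A, C]) \circ B\psi + A \circ [B, C]\psi - C \circ (A \circ B\psi) \\
&= [A, C] \circ B\psi + A \circ [B, C]\psi \\
&= ([A, C] \circ B + A \circ [B, C])\psi,
\end{aligned}$$

where we used the associativity of the composition of maps. Then, as $\psi$ was arbitrary, we
have our result. □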
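The identity of Corollary 11.11 is easy to confirm for concrete bounded operators. The sketch below uses three arbitrary $2 \times 2$ integer matrices, chosen purely for illustration, so that all arithmetic is exact.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def comm(X, Y):
    """Commutator [X, Y] = X Y - Y X."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

# Illustrative matrices; any bounded operators would do.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, -1]]

lhs = comm(matmul(A, B), C)                               # [A B, C]
rhs = add(matmul(comm(A, C), B), matmul(A, comm(B, C)))   # [A, C] B + A [B, C]
print(lhs == rhs)
```

The check relies only on associativity of matrix multiplication, exactly as the proof does.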


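Corollary 11.12. Let A and B be two operators. Then if one of them is unbounded
the domain $D_{[A,B]}$ may only have a trivial definition, i.e. $D_{[A,B]} = \{0_H\}$.


21 See Dr. Schuller's Lectures on the Geometric Anatomy of Theoretical Physics


Proof. Let $A\colon D_A \to H$ be unbounded with $D_A \subsetneq H$, and define the bounded operator
$$B_\varphi\colon H \to H, \qquad \alpha \mapsto \langle \psi, \alpha \rangle\varphi =: \ell_\psi(\alpha)\varphi$$

for some fixed $\psi \in H$ with $\psi \neq 0$ and $\varphi \in H \setminus D_A$. So we have $\operatorname{ran}(B_\varphi) = \operatorname{span}\{\varphi\}$. Then from the first
term in

$$[A, B_\varphi] := A \circ B_\varphi - B_\varphi \circ A,$$

it follows that
$$D_{[A,B_\varphi]} = \operatorname{ran}(B_\varphi) \cap D_A = \{0_H\}.$$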


Definition. Two bounded linear operators A, B € L(H) are said to commute if 
[A, B] = 0. 


Corollary 11.13. Let $A, B \in L(H)$ be commuting operators. Then if A is also non-degenerate,
any $\psi$ that is an eigenvector of A is also an eigenvector of B. In other
words, the set of A's eigenvectors is contained within the set of B's.


Proof. Let $\psi \in H \setminus \{0\}$ be an eigenvector of A with eigenvalue $\lambda \in \mathbb{C}$. Then we have

$$\begin{aligned}
[A, B]\psi &:= (A \circ B - B \circ A)\psi \\
&= A(B\psi) - B(A\psi) \\
&= A(B\psi) - \lambda(B\psi),
\end{aligned}$$

where we have used the fact that B is linear. Then from the fact that $[A, B] = 0$ it follows
that $B\psi$ is also an eigenvector of A with eigenvalue $\lambda$. Finally from the fact that A is
non-degenerate it follows that $B\psi = \mu\psi$ must hold for some $\mu \in \mathbb{C}$. □


However, as highlighted at the start of this section, we also want to look at situations 
when one of the operators may not be bounded. In other words we want to know how to 
extend the idea of commuting to 


(i) A self adjoint and not necessarily bounded, B bounded. 
(ii) Both A and B self adjoint and not necessarily bounded. 


As is often the case in maths/physics problems, the strategy is to reduce the problem 
to the known case. We then have three possible bounded, linear operators constructed from 
A: 


(i) From the Spectral Theorem we know that if A is self adjoint then there exists a unique
PVM $P_A$ such that A is spectrally decomposable. Recall that, from Remark 10.7,
$P_A(\Omega) \in L(H)$ for any $\Omega \in \sigma(O_{\mathbb{R}})$.


(ii) From the definition of the resolvent set, we have $R_A(z) \in L(H)$ for any $z \in \rho(A)$.


(iii) Again, as A is self adjoint it is spectrally decomposable and so we can consider
$\exp(itA)$ for some $t \in \mathbb{R}$, which was defined in Example 11.3. Again this is not a self
adjoint operator, but it is unitary, which means $\|\exp(itA)\| = 1$.


Definition. Let A be self adjoint and B be bounded. A and B are said to commute if
either of the following holds

(i) $[R_A(z), B] = 0$ for some $z \in \rho(A)$.
(ii) $[\exp(itA), B] = 0$ for some $t \in \mathbb{R} \setminus \{0\}$.


Definition. Let A and B be self adjoint. They are said to commute if one of the following
holds

(i) $[R_A(z_A), R_B(z_B)] = 0$ for some $z_A \in \rho(A)$ and $z_B \in \rho(B)$. This is known as the
Resolvent way.

(ii) $[\exp(itA), \exp(isB)] = 0$ for some $t, s \in \mathbb{R} \setminus \{0\}$. This is known as the Weyl way.

(iii) $[P_A(\Omega), P_B(\Omega)] = 0$ for all $\Omega \in \sigma(O_{\mathbb{R}})$. This is known as the Projector way.


Remark 11.14. The literature normally uses a practical, yet misleading, notation at this
point. For any of the above we simply write $[A, B] = 0$ for commuting A and B. However
this commutator is not the same as the one defined at the start, i.e. it does not correspond
to $A \circ B - B \circ A$. Really we should write it slightly differently to highlight this, i.e. make
it red: $\textcolor{red}{[A, B]}$.


Theorem 11.15. Let A and B be self adjoint and bounded. Then

$$\textcolor{red}{[A, B]} = 0 \iff [A, B] = 0.$$


12 Stone’s Theorem and Construction of Observables 


In this lecture we aim to answer two questions by deriving and using Stone’s Theorem. 
They are 


(i) How arbitrary is the stipulation of Axiom 4, that the dynamics in the absence of a
measurement be controlled by
$$U(t) := \exp(itH)$$
for some self adjoint operator H?


(ii) How does one practically construct observables, including the question of how to find 
the correct domain such that the operator is at least essentially self adjoint? 


Remark 12.1. Clearly for (i) we want $U(t) \circ U(s) = U(t+s)$ and $U(0) = \mathrm{id}_{\mathcal{H}}$.


12.1 One Parameter Groups and Their Generators 


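Definition. A group is a pair $(G, \diamond)$, where G is a set and $\diamond\colon G \times G \to G$ is a map satisfying:

(i) For all $g, h, k \in G$, (Associativity)
$$(g \diamond h) \diamond k = g \diamond (h \diamond k).$$

(ii) There exists $e \in G$ such that for all $g \in G$ (Neutral Element)
$$g \diamond e = e \diamond g = g.$$

(iii) For all $g \in G$ there exists $g^{-1} \in G$ such that (Inverse)
$$g \diamond g^{-1} = g^{-1} \diamond g = e.$$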


Definition. An Abelian group is a group whose group operation is symmetric. That
is, for all $g, h \in G$,
$$g \diamond h = h \diamond g.$$

Remark 12.2. Abelian groups are also known as commutative groups and the condition is
referred to as the commutativity of the elements with respect to the group operation.


Example 12.3. The real numbers equipped with addition form an Abelian group, with $e = 0 \in \mathbb{R}$ and $g^{-1} = -g$.

Note it is important that we consider all of R, and not just the positive numbers, as in 
the latter case the inverse would not lie in the group. 


Example 12.4. The set $\mathbb{R} \setminus \{0\}$ forms an Abelian group with respect to multiplication, with
$e = 1$ and $g^{-1} = 1/g$.
Note here we have to exclude 0 as 1/0 is not an element of $\mathbb{R}$.




Example 12.5. The set R \ {0} is not a group with respect to division as it fails to satisfy 
associativity. 
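
The failure of associativity in Example 12.5 can be checked on a single triple of numbers. The following is a minimal sketch; the function name `divide` and the chosen triple are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction


def divide(a, b):
    """Candidate 'group operation' on the non-zero reals: ordinary division."""
    return a / b


# Exact rational arithmetic, so no rounding issues can hide the effect.
g, h, k = Fraction(8), Fraction(4), Fraction(2)

left = divide(divide(g, h), k)    # (8 / 4) / 2
right = divide(g, divide(h, k))   # 8 / (4 / 2)

# Associativity would require left == right for every triple.
associative_on_this_triple = (left == right)
```

One counterexample is enough: associativity must hold for all triples, so division cannot serve as a group operation.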


Definition. A one-parameter group is a group whose underlying set is
$$G = \{U(t) \mid t \in \mathbb{R}\},$$
and whose group operation satisfies
$$U(t) \diamond U(s) = U(\varphi(t, s)),$$
for some $\varphi\colon \mathbb{R} \times \mathbb{R} \to \mathbb{R}$.

Remark 12.6. Unless the group is Abelian, in general $\varphi(s,t) \neq \varphi(t,s)$.


We will only deal with Abelian one-parameter groups, in which case one can always
reparameterise so that
$$U(t) \diamond U(s) = U(t + s),$$
where the commutativity with respect to $\diamond$ is inherited from the commutativity with respect
to +. We also choose the parameterisation such that U(0) = e. In particular, we will look
at unitary one-parameter groups, i.e. those with
$$G = \{U(t) \in \mathcal{L}(\mathcal{H}) \mid t \in \mathbb{R},\ U^*(t)U(t) = \mathrm{id}_{\mathcal{H}},\ \|U(t)\| = 1\},$$
that are strongly continuous in the parameter,
$$\forall \psi \in \mathcal{H}: \quad \lim_{t \to t_0} \big(U(t)\psi\big) = U(t_0)\psi.$$


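Definition. Let $U(\cdot)\colon \mathbb{R} \to \mathcal{L}(\mathcal{H})$ be a unitary, Abelian, one-parameter group (UAOPG).
Then its generator is the linear map
$$A\colon \mathcal{D}_A^{\mathrm{Stone}} \to \mathcal{H}, \qquad \psi \mapsto A\psi := i \lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\big(U(\varepsilon)\psi - \psi\big),$$
where
$$\mathcal{D}_A^{\mathrm{Stone}} := \Big\{\psi \in \mathcal{H} \;\Big|\; \lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\big(U(\varepsilon)\psi - \psi\big) \text{ exists}\Big\}.$$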
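Remark 12.7. Note
$$i \lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\big(U(\varepsilon)\psi - \psi\big) = i \lim_{\varepsilon \to 0} \frac{U(0+\varepsilon)\psi - U(0)\psi}{\varepsilon} = i[U(\cdot)\psi]'(0),$$
and so we can rewrite
$$\mathcal{D}_A^{S} := \mathcal{D}_A^{\mathrm{Stone}} = \{\psi \in \mathcal{H} \mid i[U(\cdot)\psi]'(0) \text{ exists}\}.$$

Note also that $U(\cdot)\psi\colon \mathbb{R} \to \mathcal{H}$.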
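
The difference quotient that defines the generator can be tried out numerically in a finite-dimensional toy model. The sketch below assumes $\mathcal{H} = \mathbb{C}^2$ and a diagonal self adjoint $A$ with eigenvalues 1 and -2, so that $U(t) = \exp(-itA)$ is known in closed form; these choices, and all names in the code, are illustrative assumptions only.

```python
import cmath

# Assumed toy model: A = diag(1.0, -2.0) on C^2, hence
# U(t) = exp(-itA) = diag(exp(-it), exp(2it)).
eigenvalues = [1.0, -2.0]


def U(t, psi):
    """Apply U(t) = exp(-itA) to the vector psi, component by component."""
    return [cmath.exp(-1j * t * lam) * c for lam, c in zip(eigenvalues, psi)]


def generator_quotient(psi, eps):
    """The quotient i * (U(eps) psi - psi) / eps from the definition of the generator."""
    shifted = U(eps, psi)
    return [1j * (s - c) / eps for s, c in zip(shifted, psi)]


psi = [1.0 + 0.5j, -0.3 + 2.0j]
exact = [lam * c for lam, c in zip(eigenvalues, psi)]   # A psi
approx = generator_quotient(psi, 1e-6)

# Largest componentwise deviation between the quotient and A psi.
error = max(abs(a - e) for a, e in zip(approx, exact))
```

In finite dimensions the limit exists for every vector, so the Stone domain is the whole space; the domain question only becomes non-trivial for unbounded generators.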




12.2 Stone’s Theorem 


Theorem 12.8 (Stone's Theorem). Let $U(\cdot)$ be a UAOPG that is strongly continuous and
whose group operation is the composition of maps, i.e.
$$U(t) \circ U(s) = U(t+s), \qquad U(0) = \mathrm{id}_{\mathcal{H}}.$$
Then its generator $A\colon \mathcal{D}_A^S \to \mathcal{H}$ is self adjoint on $\mathcal{D}_A^S$ and
$$U(t) = \exp(-itA).$$
Before proving this, consider the following.


Corollary 12.9. Given $U(t) = \exp(-itA)$ for some self adjoint A, the Spectral Theorem
tells us that $U(t)$ is a UAOPG.


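Proof. (i) First show $U(t) \circ U(s) = U(t+s)$:
$$\begin{aligned}
U(t) \circ U(s) &:= \exp(-itA) \circ \exp(-isA) \\
&= \int_{\mathbb{R}} e^{-it\lambda} e^{-is\lambda} \, P(d\lambda) \\
&= \int_{\mathbb{R}} e^{-i(t+s)\lambda} \, P(d\lambda) \\
&=: U(t+s),
\end{aligned}$$
where Example 11.3 has been used. Since $t + s = s + t$, this also gives the Abelian property
$U(t) \circ U(s) = U(s) \circ U(t)$.

(ii) Now show unitarity. We have
$$U^*(t) := \Big(\int_{\mathbb{R}} e^{-it\lambda} \, P(d\lambda)\Big)^* = \int_{\mathbb{R}} e^{+it\lambda} \, P(d\lambda) =: U(-t).$$
Then from the above we have $U^*(t)U(t) = U(-t+t) = U(0) = \mathrm{id}_{\mathcal{H}}$. Then noticing that
$\|U(t)\| = \|U^*(t)\|$ it follows that
$$\|U(t)\| = \sqrt{\|\mathrm{id}_{\mathcal{H}}\|} = \sqrt{1} = 1,$$
where we have used the fact that the norm is non-negative to remove the negative root,
and so it is unitary.

(iii) Finally show that it is a group. This is easily done, and we have $e = \mathrm{id}_{\mathcal{H}} = U(0)$ and
$[U(t)]^{-1} = U(-t)$. □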
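
The three properties used in this proof can be checked numerically on a non-diagonal example. The sketch below assumes $A = \sigma_1$ (the first Pauli matrix) on $\mathbb{C}^2$, for which $\exp(-it\sigma_1) = \cos(t)\,\mathrm{id} - i\sin(t)\,\sigma_1$; the choice of example and the helper names are assumptions made for illustration.

```python
import math


def U(t):
    """exp(-i t sigma_1) = cos(t) * id - i * sin(t) * sigma_1 as a 2x2 matrix."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def adjoint(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def dist(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


identity = [[1, 0], [0, 1]]
t, s = 0.7, -1.9

group_law_error = dist(matmul(U(t), U(s)), U(t + s))             # U(t) U(s) = U(t + s)
adjoint_error = dist(adjoint(U(t)), U(-t))                       # U*(t) = U(-t)
unitarity_error = dist(matmul(adjoint(U(t)), U(t)), identity)    # U*(t) U(t) = id
```

All three quantities vanish up to floating-point rounding, mirroring steps (i) and (ii) of the proof.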




Corollary 12.10. Let $\psi \in \mathcal{D}_A^S$ for some generator A and $t \in \mathbb{R}$. Then
$$[U(\cdot)\psi]'(t) = -iAU(t)\psi,$$
and
$$U(t)\mathcal{D}_A^S = \mathcal{D}_A^S.$$
(Proof to be added later.)


Now we can proceed with the proof of Stone's Theorem. To do so, we will need to
show:

(i) The generator A is densely defined (otherwise $A^*$ would not be well defined).
(ii) A is symmetric on $\mathcal{D}_A^S$ and it is essentially self adjoint.
(iii) $U(t) = \exp(-itA^{**})$, from which it follows that $A = A^{**}$ and so it is self adjoint (as
$A^{**}$ is).


Proof. (Stone’s Theorem) 


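(i) Let $\psi \in \mathcal{H}$. If we can show that an arbitrarily small neighbourhood $N$ around $\psi$
contains a $\varphi \in \mathcal{D}_A^S$, then we know $\mathcal{D}_A^S$ is dense in $\mathcal{H}$. Consider the real family, for all
$\tau \in \mathbb{R}$,
$$\psi_\tau := \int_0^\tau dr \, U(r)\psi, \qquad \lim_{\tau \to 0} \Big(\frac{\psi_\tau}{\tau}\Big) = \psi.$$
This is just the idea of the points in a neighbourhood, and so we know, therefore,
that there exists a $\tau_0$ such that $\psi_{\tau_0}/\tau_0 \in N$. Now consider $\psi_\tau$,
which satisfies
$$\begin{aligned}
U(\varepsilon)\psi_\tau - \psi_\tau &= U(\varepsilon)\int_0^\tau dr \, U(r)\psi - \int_0^\tau dr \, U(r)\psi \\
&= \int_0^\tau dr \, U(\varepsilon + r)\psi - \int_0^\tau dr \, U(r)\psi \\
&= \int_\varepsilon^{\tau+\varepsilon} dr \, U(r)\psi - \int_0^\tau dr \, U(r)\psi \\
&= \int_\tau^{\tau+\varepsilon} dr \, U(r)\psi - \int_0^\varepsilon dr \, U(r)\psi \\
&= [U(\tau) - \mathrm{id}_{\mathcal{H}}]\psi_\varepsilon.
\end{aligned}$$
Now using the fact that $\|U(\tau)\| = \|\mathrm{id}_{\mathcal{H}}\| = 1$ and so they are bounded, we can take a
limit and push it through the operators. Thus we have
$$\begin{aligned}
i[U(\cdot)\psi_\tau]'(0) &= i\lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\big(U(\varepsilon)\psi_\tau - \psi_\tau\big) \\
&= i[U(\tau) - \mathrm{id}_{\mathcal{H}}] \lim_{\varepsilon \to 0} \frac{\psi_\varepsilon}{\varepsilon} \\
&= i[U(\tau) - \mathrm{id}_{\mathcal{H}}]\psi,
\end{aligned}$$
which is an element of $\mathcal{H}$, and so we know that $\psi_\tau \in \mathcal{D}_A^S$ and therefore there exists a
$\psi_{\tau_0}/\tau_0 \in N \cap \mathcal{D}_A^S$.

(ii) Let $\varphi, \psi \in \mathcal{D}_A^S$. Then
$$\begin{aligned}
\langle \varphi, A\psi \rangle &:= \Big\langle \varphi, i\lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\big(U(\varepsilon)\psi - \psi\big) \Big\rangle \\
&= \lim_{\varepsilon \to 0} \Big\langle -\frac{i}{\varepsilon}\big(U^*(\varepsilon) - \mathrm{id}_{\mathcal{H}}\big)\varphi, \psi \Big\rangle \\
&= \Big\langle i\lim_{\varepsilon \to 0} \frac{1}{-\varepsilon}\big(U(-\varepsilon) - \mathrm{id}_{\mathcal{H}}\big)\varphi, \psi \Big\rangle \\
&= \Big\langle i\lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\big(U(\varepsilon) - \mathrm{id}_{\mathcal{H}}\big)\varphi, \psi \Big\rangle \\
&= \langle A\varphi, \psi \rangle,
\end{aligned}$$
where we have used the continuity of the inner product to move the limit in and out,
the result $U^*(t) = U(-t)$, the fact that the identity is self adjoint and the fact that
we are taking the limit to 'ignore' the minus signs on the second to last line.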

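We now want to show that it is essentially self adjoint. Recalling Theorem 7.26, we
need to check that, for $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\ker(A^* - \overline{z}) = \{0_{\mathcal{H}}\} = \ker(A^* - z).$$
Let $\varphi \in \ker(A^* - \overline{z})$, so that in particular $\varphi \in \mathcal{D}_{A^*}$. Then for all $\psi \in \mathcal{D}_A^S$
$$\begin{aligned}
\langle \varphi, U(\cdot)\psi \rangle'(t) &= \langle \varphi, [U(\cdot)\psi]'(t) \rangle \\
&= \langle \varphi, -iAU(t)\psi \rangle \\
&= -i\langle A^*\varphi, U(t)\psi \rangle \\
&= -i\langle \overline{z}\varphi, U(t)\psi \rangle \\
&= -iz\langle \varphi, U(t)\psi \rangle \\
\Longrightarrow \quad \langle \varphi, U(t)\psi \rangle &= \langle \varphi, \psi \rangle e^{-izt},
\end{aligned}$$
where we have used Corollary 12.10 and the fact that $U(0) = \mathrm{id}_{\mathcal{H}}$. But, since z
has a non-vanishing imaginary part, the exponential is unbounded in t and so the RHS is unbounded
unless its prefactor vanishes. However, the LHS is bounded (as $U(\cdot)$ is bounded) and so the only way the equality
holds is if $\langle \varphi, \psi \rangle = 0$. Finally, since this holds for all $\psi$ in the dense set $\mathcal{D}_A^S$, it follows that $\varphi = 0_{\mathcal{H}}$ and
so $\ker(A^* - \overline{z}) = \{0_{\mathcal{H}}\}$. The proof for $\ker(A^* - z)$ follows in the same way from this result,
i.e. the RHS just becomes unbounded in the opposite direction.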




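(iii) We know that A is essentially self adjoint, which means that $A^{**}$ is self adjoint. Now
construct
$$\widetilde{U}(t) := \exp(-itA^{**}) = \int_{\mathbb{R}} e^{-it\lambda} \, P_{A^{**}}(d\lambda).$$
Now let $\psi \in \mathcal{D}_A^S \subseteq \mathcal{D}_{A^{**}}$ and consider the real family
$$\begin{aligned}
\psi(t) &:= [\exp(-itA^{**}) - U(t)]\psi, \\
\psi'(t) &= [-iA^{**}\exp(-itA^{**}) + iAU(t)]\psi = -iA^{**}\psi(t),
\end{aligned}$$
where we have used the fact that $A = A^{**}$ on $\mathcal{D}_A^S$. Then we have
$$\begin{aligned}
\big(\|\psi(t)\|^2\big)' &= \langle \psi(t), \psi(t) \rangle' \\
&= 2\,\mathrm{Re}\,\langle \psi(t), \psi'(t) \rangle \\
&= 2\,\mathrm{Re}\big(-i\langle \psi(t), A^{**}\psi(t) \rangle\big) \\
&= 0,
\end{aligned}$$
where we have used the fact that $\langle \psi(t), A^{**}\psi(t) \rangle \in \mathbb{R}$ as $A^{**}$ is self adjoint. So we have
that $\|\psi(t)\|$ is a constant w.r.t. t. From the definition, we have $\psi(0) = 0$ and so
$\|\psi(t)\| = \|\psi(0)\| = 0$, which from the definition of the norm tells us $\psi(t) = 0$ for all
t. Finally it follows that
$$\exp(-itA) =: U(t) = \exp(-itA^{**}) \quad \Longrightarrow \quad A = A^{**}.$$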


12.3 Domains of Essential Self Adjointness ("Cores") 


Stone's Theorem showed us that the generator $A\colon \mathcal{D}_A^S \to \mathcal{H}$ is self adjoint. Sometimes a
compromise in choosing the domain is in order, as we shall see in two sections' time.


Corollary 12.11. Inspection of part (ii) of the proof shows that if one considers $A\colon \mathcal{D} \to \mathcal{H}$,
for some dense $\mathcal{D} \subseteq \mathcal{D}_A^S$ that also satisfies $U(t)\mathcal{D} = \mathcal{D}$, then A is essentially self
adjoint on $\mathcal{D}$.


12.4 Position, Momentum and Angular Momentum 


We now employ Stone's Theorem to properly and easily define these three operators in quantum
mechanical systems. For the rest of this lecture we shall take $\mathcal{H} = L^2(\mathbb{R}^3, \lambda) =: L^2$ where $\lambda$
is the Lebesgue measure.


Definition. The position operators, denoted $Q^j$ for j = 1, 2, 3, are defined as the generators
of
$$U^j(\cdot)\colon L^2 \to L^2,$$
with
$$(U^j(t)\psi)(x) = e^{-itx^j}\psi(x),$$
for $x := (x^1, x^2, x^3)$. That is, they are the self adjoint
$$Q^j\colon \mathcal{D}_{Q^j}^S \to L^2$$
with
$$(Q^j\psi)(x) = x^j\psi(x).$$


Remark 12.12. Note clearly $U(t) \circ U(s) = U(t+s)$, $U(0) = \mathrm{id}_{L^2}$ and $\|U(t)\psi\| = \|\psi\|$, which
tells us $\|U(t)\| = 1$, all of which are required for U(t) to be a UAOPG.


Definition. The momentum operators, denoted $P_j$ for j = 1, 2, 3, are the generators of
$$U_j(\cdot)\colon L^2 \to L^2,$$
with
$$(U_j(a)\psi)(x) = \psi(\ldots, x^j - a, \ldots),$$
i.e. they shift the j-th slot to the right by a. That is, they are the self adjoint operators on
their Stone domain that satisfy
$$P_j\psi = -i\partial_j\psi.$$

Remark 12.13. Note this is exactly the definition we used for the action of the operator in
Lecture 9.
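
To see where the derivative comes from, one can insert the shift into the difference quotient of the generator. The following is only a formal pointwise sketch, assuming $\psi$ is differentiable in the j-th slot (for instance a Schwartz function); the limit in the definition of the generator is really taken in the $L^2$ norm.

```latex
\begin{aligned}
(P_j\psi)(x)
  &= i \lim_{\varepsilon \to 0} \frac{1}{\varepsilon}
     \Big( (U_j(\varepsilon)\psi)(x) - \psi(x) \Big) \\
  &= i \lim_{\varepsilon \to 0}
     \frac{\psi(\ldots, x^j - \varepsilon, \ldots) - \psi(\ldots, x^j, \ldots)}{\varepsilon} \\
  &= -i\,(\partial_j \psi)(x).
\end{aligned}
```

The minus sign in front of $\varepsilon$ inside the argument is what turns the overall factor $i$ into $-i$.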


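Definition. The orbital angular momentum operators, denoted $L_j$ for j = 1, 2, 3, are the
generators of
$$U_j(\cdot)\colon L^2 \to L^2,$$
with
$$(U_j(a)\psi)(x) = \psi(D_j(a)x),$$
where $D_j(a)\colon \mathbb{R}^3 \to \mathbb{R}^3$ is the operator that describes the rotation about the j-th axis by
angle a. They satisfy
$$\begin{aligned}
(L_1\psi)(x) &= -i(x^2\partial_3\psi - x^3\partial_2\psi), \\
(L_2\psi)(x) &= -i(x^3\partial_1\psi - x^1\partial_3\psi), \\
(L_3\psi)(x) &= -i(x^1\partial_2\psi - x^2\partial_1\psi).
\end{aligned}$$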
Corollary 12.14. The spectrum for the orbital angular momentum is contained within the
integers; $\sigma(L_j) \subseteq \mathbb{Z}$ for j = 1, 2, 3.

Proof. From Stone's theorem we have
$$U_j(a) = \exp(-iaL_j),$$
which together with $D_j(a + 2\pi) = D_j(a)$ gives
$$\exp(-i2\pi L_j) = \mathrm{id}_{\mathcal{H}}.$$
Then, from the fact that $L_j$ is self adjoint, we can use the Spectral theorem to decompose
both sides
$$\int_{\mathbb{R}} e^{-i2\pi\lambda} \, P_{L_j}(d\lambda) = \int_{\mathbb{R}} P_{L_j}(d\lambda),$$
and so $\lambda \in \mathbb{Z}$. □




12.5 Schwartz Space $S(\mathbb{R}^d)$


As we have just seen, Stone’s theorem gives us a nice way to define the position, momen- 
tum and orbital angular momentum operators. However there are two problems with the 
definitions we have, both of which relate to their Stone domains. They are 
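(i) $\mathcal{D}_Q^S \neq \mathcal{D}_P^S \neq \mathcal{D}_L^S \neq \mathcal{D}_Q^S$,
(ii) $\mathcal{D}_Q^S \neq \mathcal{D}_{Q \circ Q}^S \neq \mathcal{D}_{Q \circ Q \circ Q}^S \neq \ldots$ and similarly for P and L.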


This, at first, might not seem like such a big deal but on a second look we see that 
it means havoc when it comes to trying to define the QM version of kinetic energy as 
(P o P)/2m. The problem is especially bad when it comes to considering commutators, as 
highlighted before. 

We get around this problem using the compromise given in Corollary 12.11. 


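Definition. The Schwartz Space on $\mathbb{R}^d$, denoted $S(\mathbb{R}^d)$, is the vector space with set
$$S(\mathbb{R}^d) := \Big\{\psi \in C^\infty(\mathbb{R}^d) \;\Big|\; \sup_{x \in \mathbb{R}^d} |x^\alpha(\partial_\beta\psi)(x)| < \infty, \ \forall \alpha, \beta \in \mathbb{N}_0^{\times d}\Big\},$$
where
$$\mathbb{N}_0^{\times d} := \underbrace{\mathbb{N}_0 \times \ldots \times \mathbb{N}_0}_{d\text{-fold}}$$
and
$$\begin{aligned}
x^\alpha &:= x^{(\alpha_1, \ldots, \alpha_d)} := (x^1)^{\alpha_1} \ldots (x^d)^{\alpha_d}, \\
\partial_\beta &:= \partial_{(\beta_1, \ldots, \beta_d)} := (\partial_1)^{\beta_1} \ldots (\partial_d)^{\beta_d}.
\end{aligned}$$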


Remark 12.15. The Schwartz Space is also known as the space of rapidly decaying test
functions.


Remark 12.16. Clearly the space $C_c^\infty(\mathbb{R}^d)$, as defined in footnote 15 in Lecture 9, is
contained within $S(\mathbb{R}^d)$.
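
A standard example, added here as an illustration rather than taken from the lecture, shows that this inclusion is strict. For the Gaussian one finds, with $p_\beta$ denoting some polynomial of degree $|\beta|$ (a symbol introduced only for this sketch):

```latex
\psi(x) := e^{-|x|^2}, \qquad
x^\alpha (\partial_\beta \psi)(x) = x^\alpha \, p_\beta(x) \, e^{-|x|^2}, \qquad
\sup_{x \in \mathbb{R}^d} \big| x^\alpha \, p_\beta(x) \, e^{-|x|^2} \big| < \infty .
```

Every derivative is a polynomial times the same Gaussian, and the Gaussian decays faster than any polynomial grows, so $\psi \in S(\mathbb{R}^d)$. On the other hand $\psi$ vanishes nowhere, so its support is all of $\mathbb{R}^d$ and $\psi \notin C_c^\infty(\mathbb{R}^d)$.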


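Lemma 12.17. The Schwartz space is closed under pointwise multiplication; if $\psi, \varphi \in S(\mathbb{R}^d)$
then $\psi \bullet \varphi \in S(\mathbb{R}^d)$. In fact we have the Schwartz algebra $(S(\mathbb{R}^d), +, \cdot, \bullet)$.

Proof. This result follows simply from the so called Leibniz Rule, which is an extension of
the product rule.²²
$$\begin{aligned}
\sup_{x \in \mathbb{R}^d} |x^\alpha(\partial_\beta(\psi \bullet \varphi))(x)| &= \sup_{x \in \mathbb{R}^d} |x^\alpha(\partial_\beta(\psi) \bullet \varphi + \psi \bullet \partial_\beta(\varphi))(x)| \\
&\leq \sup_{x \in \mathbb{R}^d} |x^\alpha(\partial_\beta(\psi) \bullet \varphi)(x)| + \sup_{x \in \mathbb{R}^d} |x^\alpha(\psi \bullet \partial_\beta(\varphi))(x)| \\
&< \infty.
\end{aligned}$$
Then, using the fact that the pointwise multiplication of two smooth functions is smooth,
we have $\psi \bullet \varphi \in S(\mathbb{R}^d)$. Finally, using the linearity of everything involved we get the
algebra. □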


22 See Dr. Schuller's Lectures on the Geometric Anatomy of Theoretical Physics for a definition in
context.




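Lemma 12.18. For $1 \leq p \leq \infty$ we have $S(\mathbb{R}^d) \subseteq L^p(\mathbb{R}^d)$.

Proof. Let $\psi \in S(\mathbb{R}^d)$. Taking $\beta = 0$ in the definition shows that $(1 + |x|^2)^N|\psi(x)|$ is bounded
on $\mathbb{R}^d$ for every $N \in \mathbb{N}_0$, so $|\psi|^p$ is dominated by a constant multiple of $(1 + |x|^2)^{-Np}$, which is
integrable once $2Np > d$. Since $\psi$ is continuous, Corollary 5.19 tells us that it is measurable with
respect to the Borel σ-algebras, so $\psi \in L^p(\mathbb{R}^d)$ for all $1 \leq p < \infty$. The case $p = \infty$ follows from
the boundedness of $\psi$ itself. □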


Lemma 12.19. One can show that the Fourier Transform is a linear isomorphism from
$S(\mathbb{R}^d)$ onto itself.²³


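Theorem 12.20. The Schwartz space as defined above satisfies
(i) $S(\mathbb{R}^d) \subseteq L^2(\mathbb{R}^d, \lambda)$ is dense,
(ii) $S(\mathbb{R}^3) \subseteq \mathcal{D}_Q^S, \mathcal{D}_P^S, \mathcal{D}_L^S$ is dense,
(iii) $Q^j\colon S(\mathbb{R}^3) \to S(\mathbb{R}^3)$ is essentially self adjoint. Same for $P_j$ and $L_j$.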


Remark 12.21. From the last condition we see that we can repeatedly apply the operators, 
in any order, to a system. This fixes our problem above. 


23 See lecture 18. 




13 Spin 


In the previous lecture we defined orbital angular momentum. The emphasis on 'orbital' was
not a mistake; this lecture aims to discuss what is referred to as general angular momentum
(or just angular momentum) in QM. This latter case gets its name from the fact that any
concrete set of three operators, $J_1, J_2, J_3$ say, obey analogous commutation relations to
$\{L_1, L_2, L_3\}$.
Recall, for $\psi \in S(\mathbb{R}^3)$ we have
$$L_j\colon S(\mathbb{R}^3) \to S(\mathbb{R}^3),$$
essentially self adjoint with
$$\begin{aligned}
(L_1\psi)(x) &= -i(x^2\partial_3\psi - x^3\partial_2\psi), \\
(L_2\psi)(x) &= -i(x^3\partial_1\psi - x^1\partial_3\psi), \\
(L_3\psi)(x) &= -i(x^1\partial_2\psi - x^2\partial_1\psi).
\end{aligned}$$

We can, therefore, calculate their commutation relations. 


Lemma 13.1. The orbital angular momentum operators obey the following commutation
relations:
$$\begin{aligned}
[L_1, L_2] &= iL_3, \\
[L_2, L_3] &= iL_1, \\
[L_3, L_1] &= iL_2.
\end{aligned}$$


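Proof. The proof follows from direct computation. Consider
$$\begin{aligned}
[L_1, L_2]\psi &= (L_1 \circ L_2 - L_2 \circ L_1)\psi \\
&= (-i)^2(x^2\partial_3 - x^3\partial_2)(x^3\partial_1\psi - x^1\partial_3\psi) - (-i)^2(x^3\partial_1 - x^1\partial_3)(x^2\partial_3\psi - x^3\partial_2\psi) \\
&= -\big(x^2\partial_1\psi + x^2x^3\partial_3\partial_1\psi - x^2x^1\partial_3^2\psi - (x^3)^2\partial_2\partial_1\psi + x^3x^1\partial_2\partial_3\psi\big) \\
&\quad + x^2x^3\partial_1\partial_3\psi - (x^3)^2\partial_1\partial_2\psi - x^1x^2\partial_3^2\psi + x^1x^3\partial_3\partial_2\psi + x^1\partial_2\psi \\
&= x^1\partial_2\psi - x^2\partial_1\psi = i(-i)(x^1\partial_2\psi - x^2\partial_1\psi) \\
&= iL_3\psi,
\end{aligned}$$
where we have used the fact that $S(\mathbb{R}^3) \subseteq C^\infty(\mathbb{R}^3) \subseteq C^2(\mathbb{R}^3)$, and so we can swap derivative
order, i.e.
$$\partial_1\partial_2\psi = \partial_2\partial_1\psi.$$
The same method is used for the other two commutation relations. □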
Remark 13.2. The vector space with set $V := \mathrm{span}_{\mathbb{C}}\{L_1, L_2, L_3\}$ can be defined, and we
have that $iL_j \in V$, and so the above tells us that $(V, +, \cdot, [\cdot,\cdot])$ is the orbital angular
momentum Lie algebra.




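Remark 13.3. We can re-write the commutation relations in the compact and convenient
form
$$[L_i, L_j] = i\epsilon_{ijk}L_k,$$
where a sum over k is implied and $\epsilon_{ijk}$ is the totally antisymmetric Levi-Civita symbol, defined as
$$\epsilon_{ijk} = \begin{cases} +1 & \text{if } (i,j,k) \text{ is an even permutation of } (1,2,3), \\ -1 & \text{if } (i,j,k) \text{ is an odd permutation of } (1,2,3), \\ 0 & \text{otherwise.} \end{cases}$$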
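
The compact form can be checked mechanically on a concrete set of matrices. The sketch below does not use the differential operators $L_j$ themselves; instead it assumes the 3x3 matrices with entries $(J_i)_{jk} = -i\epsilon_{ijk}$, a standard finite-dimensional set obeying the same relations. The closed formula used for $\epsilon_{ijk}$ and all helper names are choices made for this illustration.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {1, 2, 3}."""
    # (i - j)(j - k)(k - i) is +2 / -2 for even / odd permutations, 0 otherwise.
    return (i - j) * (j - k) * (k - i) // 2


INDICES = (1, 2, 3)

# Assumed example matrices: (J_i)_{jk} = -i * eps_{ijk}.
J = {i: [[-1j * levi_civita(i, j, k) for k in INDICES] for j in INDICES]
     for i in INDICES}


def matmul(a, b):
    return [[sum(a[r][n] * b[n][c] for n in range(3)) for c in range(3)] for r in range(3)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]


def structure_side(i, j):
    """Right-hand side i * sum_k eps_{ijk} J_k of the commutation relation."""
    return [[1j * sum(levi_civita(i, j, k) * J[k][r][c] for k in INDICES)
             for c in range(3)] for r in range(3)]


# Largest deviation between [J_i, J_j] and i eps_{ijk} J_k over all index pairs.
max_error = max(
    abs(commutator(J[i], J[j])[r][c] - structure_side(i, j)[r][c])
    for i in INDICES for j in INDICES for r in range(3) for c in range(3)
)
```

Because the check runs over all nine index pairs, it also covers the antisymmetry $[J_i, J_j] = -[J_j, J_i]$ and the vanishing of $[J_i, J_i]$.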


Proposition 13.4. It is not possible to have a common eigenvector between the operators
$(L_1, L_2, L_3)$.

Proof. Let $V := \mathrm{span}_{\mathbb{C}}\{L_1, L_2, L_3\}$. Now assume that $\psi \in \mathcal{D} \setminus \{0_{\mathcal{H}}\}$ is an eigenvector of
both $L_1$ and $L_2$ with eigenvalues $\lambda, \mu \in \mathbb{C}$, respectively. Then we have
$$\begin{aligned}
[L_1, L_2]\psi &= L_1(L_2\psi) - L_2(L_1\psi) \\
&= \mu L_1\psi - \lambda L_2\psi \\
&= \mu\lambda\psi - \lambda\mu\psi \\
&= 0,
\end{aligned}$$
where we have used the linearity of the operators. It follows from the commutation relations
that $L_3\psi = 0$. However, from the other commutation relations we then have
$$\begin{aligned}
[L_2, L_3]\psi &= iL_1\psi \\
L_2(L_3\psi) - L_3(L_2\psi) &= i\lambda\psi \\
L_2(0) - \mu L_3\psi &= i\lambda\psi \\
0 - 0 &= i\lambda\psi \\
\Longrightarrow \quad \lambda &= 0,
\end{aligned}$$
and similarly you can show $\mu = 0$. It follows then that for any $L \in V$ we have $L\psi = 0$,
which can only be true if $\psi = 0_{\mathcal{H}}$. But this contradicts the opening assumption and so it
can't be true. □


Remark 13.5. People often say that "two non-commuting operators have no common eigenvectors",
however this statement is not strictly true. What is meant is "two operators, whose
commutator does not contain the zero vector in its range, do not have common eigenvectors."
This is subtly different, however the distinction is important. For example, in the
previous proposition if we instead had $[L_1, L_2] = iL_3$ and $[L_2, L_3] = 0 = [L_3, L_1]$, we would
not need to require $\lambda = 0 = \mu$, and so, unless further constraints were placed on the system
of operators, it is possible that $\psi$ is a common eigenvector to $L_1$ and $L_2$.




13.1 General Spin 


At this point we might ask why we are bothering to work out these commutation relations.
After all they appear to be of no use when it comes to calculating things such as the spectra
of the operators (as is evident from Corollary 12.14). The answer to this is that we want to see
what information we can obtain about the system (specifically its spectrum) using solely the
commutation relations, as then any other set of observables that shares these commutation
relations immediately obeys the same results.

To emphasise, any Lie algebra that contains three operators, $S_1, S_2, S_3$ say, with
$$S_i\colon \mathcal{D} \to \mathcal{D},$$
for some $\mathcal{D} \subseteq \mathcal{H}$, that obey commutation relations analogous to those of Lemma 13.1 will
instantly satisfy any results we derive, using only the commutation relations, for the orbital
angular momentum.


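Example 13.6. An example of such a set of operators is the so-called Pauli spin algebra,
which has
$$S_i := \frac{1}{2}\sigma_i,$$
with $\mathcal{D} = \mathcal{H} = \mathbb{C}^2$, where
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$
known as the Pauli spin matrices. It is easily checked, through the rules of matrix multiplication,
that this algebra obeys the correct commutation relations. This is an example of
a so-called spin-$\frac{1}{2}$ system.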
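
The "easily checked" claim can be delegated to a few lines of code. The sketch below assumes the normalisation $S_i = \sigma_i/2$ and verifies the three cyclic commutation relations of Lemma 13.1, together with the sum of squares $S_1^2 + S_2^2 + S_3^2$; the helper names are illustrative only.

```python
# Pauli matrices as nested lists of (complex) numbers.
sigma = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

# Assumed normalisation: S_i = sigma_i / 2.
S = {i: [[0.5 * x for x in row] for row in sigma[i]] for i in sigma}


def matmul(a, b):
    return [[sum(a[r][n] * b[n][c] for n in range(2)) for c in range(2)] for r in range(2)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(2)] for r in range(2)]


def dist(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[r][c] - b[r][c]) for r in range(2) for c in range(2))


# Check [S_i, S_j] = i S_k for the cyclic triples (1,2,3), (2,3,1), (3,1,2).
errors = {}
for i, j, k in [(1, 2, 3), (2, 3, 1), (3, 1, 2)]:
    target = [[1j * x for x in row] for row in S[k]]
    errors[(i, j)] = dist(commutator(S[i], S[j]), target)

# Sum of squares S_1^2 + S_2^2 + S_3^2.
squares = [matmul(S[i], S[i]) for i in (1, 2, 3)]
sum_of_squares = [[sum(sq[r][c] for sq in squares) for c in range(2)] for r in range(2)]
```

The sum of squares comes out as 3/4 times the identity, i.e. $\frac{1}{2}(\frac{1}{2} + 1)$, which is the value one expects for spin $\frac{1}{2}$.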


Remark 13.7. We can not expect the commutation relations to necessarily tell us everything 
about the spectrum (or any other quantity we try to calculate) as they can be derived 
by several different operator sets with potentially differing spectra. But, as said above, 
whatever we can infer from the commutation relations alone must hold for all the operator
sets.

13.2 Derivation of Pure Point Spectrum


We start from a general Lie algebra with our required conditions. We shall denote the
operators by $J_1, J_2, J_3$, however they need not be the orbital angular momentum operators.
Equally the domain, $\mathcal{D}$, is left arbitrary, up to the condition that the operators are at least
essentially self adjoint on it.


Definition. A Casimir operator for the algebra $(V, +, \cdot, [\cdot,\cdot])$ is a symmetric operator
$$\Omega\colon \mathcal{D} \to \mathcal{D}$$
that commutes with every element in V.




Remark 13.8. Note, due to the bilinearity of the commutator, we only need to check that
$\Omega$ commutes with the basis elements of V.


Proposition 13.9. The operator
$$\Omega := J_1 \circ J_1 + J_2 \circ J_2 + J_3 \circ J_3$$
is a Casimir operator for the algebras we're considering.


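Proof. The symmetric part follows trivially from the fact that Corollary 12.11 tells us
$J_1, J_2, J_3$ are all symmetric. Next let $\psi \in \mathcal{D}$ and consider
$$\begin{aligned}
[\Omega, J_1]\psi &:= [J_1 \circ J_1 + J_2 \circ J_2 + J_3 \circ J_3, J_1]\psi \\
&= [J_1 \circ J_1, J_1]\psi + [J_2 \circ J_2, J_1]\psi + [J_3 \circ J_3, J_1]\psi \\
&= (J_1 \circ J_1 \circ J_1)\psi - (J_1 \circ J_1 \circ J_1)\psi \\
&\quad + (J_2 \circ J_2 \circ J_1)\psi - (J_1 \circ J_2 \circ J_2)\psi \\
&\quad + (J_3 \circ J_3 \circ J_1)\psi - (J_1 \circ J_3 \circ J_3)\psi \\
&= (J_2 \circ J_1 \circ J_2)\psi + (J_2 \circ [J_2, J_1])\psi - (J_1 \circ J_2 \circ J_2)\psi \\
&\quad + (J_3 \circ J_1 \circ J_3)\psi + (J_3 \circ [J_3, J_1])\psi - (J_1 \circ J_3 \circ J_3)\psi \\
&= (J_1 \circ J_2 \circ J_2)\psi + ([J_2, J_1] \circ J_2)\psi + (J_2 \circ [J_2, J_1])\psi - (J_1 \circ J_2 \circ J_2)\psi \\
&\quad + (J_1 \circ J_3 \circ J_3)\psi + ([J_3, J_1] \circ J_3)\psi + (J_3 \circ [J_3, J_1])\psi - (J_1 \circ J_3 \circ J_3)\psi \\
&= -i(J_3 \circ J_2)\psi - i(J_2 \circ J_3)\psi + i(J_2 \circ J_3)\psi + i(J_3 \circ J_2)\psi \\
&= 0,
\end{aligned}$$
which, because $\psi \in \mathcal{D}$ was arbitrary, tells us $[\Omega, J_1] = 0$. The same method gives
$[\Omega, J_2] = 0 = [\Omega, J_3]$. □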


Definition. Let $J_1, J_2, J_3$ be three operators that satisfy our conditions. Then define
$$\begin{aligned}
J_+ &:= J_1 + iJ_2, \\
J_- &:= J_1 - iJ_2,
\end{aligned}$$
known as the ladder operators, for a reason that will soon become clear.


Remark 13.10. We can choose to consider the set $\{J_+, J_-, J_3\}$ in place of the set $\{J_1, J_2, J_3\}$
while still keeping all the information, as we can simply reconstruct $J_1$ and $J_2$ from $J_+$
and $J_-$. Note, however, in doing this we have broken the symmetry of the algebra (in the
sense that none of the $J_i$ are special, they all obey the same commutation relations) by
singling out $J_3$, while taking linear combinations of $J_1$ and $J_2$. Indeed we did not need to
make this choice of symmetry breaking; we could equally have chosen to keep $J_1$ while
defining $J_+$ and $J_-$ as linear combinations of $J_2$ and $J_3$. Importantly, the results that follow
will hold equally for whichever $J_i$ we choose, and so in order to stick with convention we pick
$J_3$. Note also that we no longer have a set of observables, as $(J_+)^* = J_-$ and $(J_-)^* = J_+$.




Lemma 13.11. $J_+$ and $J_-$ satisfy the following commutation relations
$$\begin{aligned}
[J_+, J_-] &= 2J_3, \\
[J_3, J_\pm] &= \pm J_\pm, \\
[\Omega, J_\pm] &= 0.
\end{aligned}$$
Proof. From direct computation. Not done here to save space. □


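Lemma 13.12. We can rewrite our Casimir operator as
$$\Omega = J_+ \circ J_- + J_3 \circ (J_3 - \mathrm{id}_{\mathcal{D}}),$$
or equally as
$$\Omega = J_- \circ J_+ + J_3 \circ (J_3 + \mathrm{id}_{\mathcal{D}}).$$
Proof. We shall just show the first one:
$$\begin{aligned}
J_+ \circ J_- + J_3 \circ (J_3 - \mathrm{id}_{\mathcal{D}}) &= (J_1 + iJ_2) \circ (J_1 - iJ_2) + J_3 \circ (J_3 - \mathrm{id}_{\mathcal{D}}) \\
&= J_1 \circ J_1 + J_2 \circ J_2 - i(J_1 \circ J_2 - J_2 \circ J_1) + J_3 \circ J_3 - J_3 \\
&= J_1 \circ J_1 + J_2 \circ J_2 + J_3 \circ J_3 - i[J_1, J_2] - J_3 \\
&= J_1 \circ J_1 + J_2 \circ J_2 + J_3 \circ J_3 - (i)^2J_3 - J_3 \\
&= \Omega.
\end{aligned}$$
□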


Remark 13.13. Both of the expressions in the above lemma are always true. It is not
that one is true under certain circumstances and then the other is true. This is an important
observation that we shall use.


At this point we might wonder why we are going through so much effort introducing the
Casimir, when what we're looking for is the spectra of the operators $J_1, J_2, J_3$. The answer
is to make the problem seemingly more complicated by now considering only eigenvectors
that are common to both $J_3$ and $\Omega$. Note it is necessary that they commute if they are to
have common eigenvectors, as is easily verified from the definition of the commutator.
That is, we want to find a $\psi_{\lambda,\mu} \in \mathcal{D} \setminus \{0\}$²⁴ such that
$$\begin{aligned}
J_3\psi_{\lambda,\mu} &= \mu\psi_{\lambda,\mu}, \\
\Omega\psi_{\lambda,\mu} &= \lambda\psi_{\lambda,\mu},
\end{aligned}$$
where the subscript is included in order to label the eigenvector by its eigenvalues.
Again this appears to be a more complicated problem: we now not only need to
check that our $\psi$ is an eigenvector of $J_3$ but we also need to check that it is an eigenvector of $\Omega$.
However, we can show that every eigenvector of $J_3$ (and equally for $J_1$ and $J_2$) is also an
eigenvector of $\Omega$. For a proof of this see Peter Ferguson's well detailed answer on Quora.


24 Recall that an eigenvector can not be the zero-vector by definition. We shall use this later.




We might also ask whether this will give us the spectrum of $J_3$ anyway, as $J_3$ is only
essentially self adjoint on our Schwartz domain, $\mathcal{D}$. We are OK, however, as $(J_3)^{**}$ is self
adjoint on its domain and so we could just use this instead, where the operation is defined to
be the same as $J_3$, just as we did for the momentum operator in Lecture 9.


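Lemma 13.14. The eigenvalues for these common eigenvectors satisfy
$$\lambda \geq |\mu|(|\mu| + 1).$$

Proof. We shall consider both cases for the rewriting of $\Omega$, however, as explained above, the
cases bracket does not mean one is true under certain conditions and the other otherwise,
they are both true. We shall also drop the $\circ$ symbols to lighten notation. Thus we have
$$\begin{aligned}
\lambda\langle \psi_{\lambda,\mu}, \psi_{\lambda,\mu} \rangle &= \langle \psi_{\lambda,\mu}, \Omega\psi_{\lambda,\mu} \rangle \\
&= \begin{cases} \langle \psi_{\lambda,\mu}, (J_+J_- + J_3(J_3 - \mathrm{id}_{\mathcal{D}}))\psi_{\lambda,\mu} \rangle \\ \langle \psi_{\lambda,\mu}, (J_-J_+ + J_3(J_3 + \mathrm{id}_{\mathcal{D}}))\psi_{\lambda,\mu} \rangle \end{cases} \\
&= \begin{cases} \langle J_-\psi_{\lambda,\mu}, J_-\psi_{\lambda,\mu} \rangle + \mu(\mu - 1)\langle \psi_{\lambda,\mu}, \psi_{\lambda,\mu} \rangle \\ \langle J_+\psi_{\lambda,\mu}, J_+\psi_{\lambda,\mu} \rangle + \mu(\mu + 1)\langle \psi_{\lambda,\mu}, \psi_{\lambda,\mu} \rangle \end{cases}
\end{aligned}$$
where we have used the fact that $(J_-)^* = J_+$ and vice versa.²⁵
Now recalling that $\psi_{\lambda,\mu}$ is an eigenvector, and so, by definition, not the zero vector, we
know the inner product $\langle \psi_{\lambda,\mu}, \psi_{\lambda,\mu} \rangle$ is strictly positive, and thus we can divide by it, giving
$$\lambda = \begin{cases} \dfrac{\|J_-\psi_{\lambda,\mu}\|^2}{\|\psi_{\lambda,\mu}\|^2} + \mu(\mu - 1) \\[2ex] \dfrac{\|J_+\psi_{\lambda,\mu}\|^2}{\|\psi_{\lambda,\mu}\|^2} + \mu(\mu + 1). \end{cases}$$
Finally, from the fact that the norm is non-negative (i.e. the first term in each case
is either positive or vanishes) we have
$$\lambda \geq \begin{cases} \mu(\mu - 1) \\ \mu(\mu + 1) \end{cases}$$
and so it follows that $\lambda \geq |\mu|(|\mu| + 1)$. □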


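Lemma 13.15. The elements $J_\pm\psi_{\lambda,\mu}$ are common 'eigenvectors' of $\Omega$ and $J_3$ with eigenvalues
$\lambda$ and $(\mu \pm 1)$, respectively.

Proof. First consider $\Omega$. We have
$$\begin{aligned}
\Omega J_\pm\psi_{\lambda,\mu} &= J_\pm\Omega\psi_{\lambda,\mu} + [\Omega, J_\pm]\psi_{\lambda,\mu} \\
&= J_\pm(\lambda\psi_{\lambda,\mu}) \\
&= \lambda(J_\pm\psi_{\lambda,\mu}),
\end{aligned}$$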

25 Really what you need to do is expand out $J_-$ and $J_+$ in terms of $J_1$ and $J_2$ and then take the adjoint.
The result holds.




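where we used the fact that $[\Omega, J_\pm] = 0$, and the linearity of $J_\pm$.
Now consider $J_3$,
$$\begin{aligned}
J_3J_\pm\psi_{\lambda,\mu} &= J_\pm J_3\psi_{\lambda,\mu} + [J_3, J_\pm]\psi_{\lambda,\mu} \\
&= \mu(J_\pm\psi_{\lambda,\mu}) \pm J_\pm\psi_{\lambda,\mu} \\
&= (\mu \pm 1)(J_\pm\psi_{\lambda,\mu}),
\end{aligned}$$
where again we used the commutator, $[J_3, J_\pm] = \pm J_\pm$. □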


Remark 13.16. The previous lemma allows us to conclude that $J_\pm\psi_{\lambda,\mu} \propto \psi_{\lambda,\mu\pm1}$.


Remark 13.17. This is why $J_\pm$ are known as ladder operators, with $J_+$ known as the raising
operator and $J_-$ the lowering operator. As we see, these names derive from the eigenvalues
they produce as eigenvectors of $J_3$.

If the $J_3$ eigenvalue of $\psi_{\lambda,\mu}$ corresponds to the $\mu$-th rung of a ladder then the eigenvalue
of $J_+\psi_{\lambda,\mu}$ corresponds to the $(\mu + 1)$-th rung, and $J_-\psi_{\lambda,\mu}$ the $(\mu - 1)$-th. Note each one of
the rungs is separated by exactly the same distance, and that we get the $(\mu + n)$-th rung
from $(J_+)^n\psi_{\lambda,\mu}$ and similarly for the $(\mu - n)$-th rung.

[Figure: ladder of $J_3$ eigenvalues with rungs $(\mu - 1)$, $\mu$ and $(\mu + 1)$; $J_+$ moves $\psi_{\lambda,\mu}$ up one rung and $J_-$ moves it down one rung.]

The next question would be is this a ‘proper’ ladder; that is does it have a top and bottom 
rung or does it continue forever? The answer comes in the form of the next lemma. 


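Lemma 13.18. There exists a $\psi_{\lambda,\overline{\mu}}$ such that $J_+\psi_{\lambda,\overline{\mu}} = 0$. Equally there exists a $\psi_{\lambda,\underline{\mu}}$ such
that $J_-\psi_{\lambda,\underline{\mu}} = 0$.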


Proof. We know from Lemma 13.14 that $|\mu|(|\mu| + 1) \leq \lambda$ holds for any common eigenvector
of $\Omega$ and $J_3$. We see from Lemma 13.15 that $(J_\pm)^n\psi_{\lambda,\mu}$ is such a common eigenvector, and
so must obey $|\mu \pm n|(|\mu \pm n| + 1) \leq \lambda$. However, $\lambda$ is unchanged by this repeated application
of the ladder operators, and so, unless remedied, this inequality will eventually be broken;
that is, we need to somehow cap the available n values.

Consider first the raising operator. In this case $\mu + n$ gets bigger and bigger, and so
we need to cap n from above. In other words, we require there to be an $m \in \mathbb{N}$ such that
for all $n > m$, $(J_+)^n\psi_{\lambda,\mu} = 0$. This fixes our problem as this corresponds to the zero vector
and so, by definition, it cannot be an eigenvector, and the $\lambda$ inequality no longer need hold.
We define
$$\psi_{\lambda,\overline{\mu}} := (J_+)^m\psi_{\lambda,\mu}.$$
The idea is exactly the same for the lowering operator, however now $\mu - n$ is
getting smaller and smaller, and so its modulus (after $n > \mu$ is reached) gets bigger and
bigger. So again we need to cap n from above. We require there to be an $\ell \in \mathbb{N}$ such that
for all $n > \ell$, $(J_-)^n\psi_{\lambda,\mu} = 0$. We define
$$\psi_{\lambda,\underline{\mu}} := (J_-)^\ell\psi_{\lambda,\mu}.$$
Note we do not have any a priori relation between the values of m and $\ell$. To use the
ladder analogy, m is the number of rungs above the $\mu$-th rung and $\ell$ is the number of rungs
below the $\mu$-th rung. For a given $\lambda$, the highest value of $\mu$ is denoted $\overline{\mu}(\lambda)$ and the lowest
value $\underline{\mu}(\lambda)$. □


Remark 13.19. Note the above tells us that the $J_\pm\psi_{\lambda,\mu}$ are not strictly eigenvectors, as they could be
the zero vector. This is why we wrote 'eigenvectors' in inverted commas in Lemma 13.15.


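Proposition 13.20. The maximum and minimum values of $\mu$ satisfy
(i) $\lambda = \overline{\mu}(\lambda)(\overline{\mu}(\lambda) + 1)$,
(ii) $\underline{\mu}(\lambda) = -\overline{\mu}(\lambda)$,
(iii) $\overline{\mu}(\lambda) \in \mathbb{N}_0/2$.

Proof. (i) From the proof of Lemma 13.14, and the fact that $J_+\psi_{\lambda,\overline{\mu}(\lambda)} = 0$, and so
$\|J_+\psi_{\lambda,\overline{\mu}(\lambda)}\| = 0$, we have
$$\lambda = \overline{\mu}(\lambda)(\overline{\mu}(\lambda) + 1).$$

(ii) Repeating the above argument but with the fact that $\|J_-\psi_{\lambda,\underline{\mu}(\lambda)}\| = 0$, we have
$$\lambda = \underline{\mu}(\lambda)(\underline{\mu}(\lambda) - 1).$$
Comparing with (i), either $\underline{\mu}(\lambda) = -\overline{\mu}(\lambda)$ or $\underline{\mu}(\lambda) = \overline{\mu}(\lambda) + 1$; the latter is excluded
because $\underline{\mu}(\lambda) \leq \overline{\mu}(\lambda)$, and so $\underline{\mu}(\lambda) = -\overline{\mu}(\lambda)$.

(iii) From the previous, along with the fact that $\overline{\mu}(\lambda) - \underline{\mu}(\lambda) \in \mathbb{N}_0$, we get $2\overline{\mu}(\lambda) \in \mathbb{N}_0$ and this result follows
trivially.
□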


Remark 13.21. In order to be consistent with the literature we shall introduce the following
relabelling
$$j := \overline{\mu}(\lambda), \qquad m := \mu.$$
Note we have $j \in \mathbb{N}_0/2$.




Theorem 13.22. The common eigenvectors of $\Omega$ and $J_3$ come as families $\psi_{j(j+1),m}$ where
$m = -j, -j+1, \ldots, j-1, j$. The eigenvalue $j(j+1)$ is associated to $\Omega$ and m is associated
to $J_3$.

One normally normalises these eigenvectors and defines
$$\Phi_{j,m} := \frac{\psi_{j(j+1),m}}{\|\psi_{j(j+1),m}\|}.$$
Then, from Lemma 8.8 and the fact that the eigenvectors have distinct eigenvalues, we have
$$\langle \Phi_{j,m}, \Phi_{j',m'} \rangle = \delta_{jj'}\delta_{mm'}.$$

Corollary 13.23. We have $m \in \mathbb{Z}/2$.

Proof. This comes from just allowing $j \in \mathbb{N}_0/2$ to be any element and then using
$m = -j, \ldots, j$. □

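Proposition 13.24. The common eigenvectors $\Phi_{j,m}$ satisfy
$$J_\pm\Phi_{j,m} = \sqrt{j(j+1) - m(m \pm 1)} \, \Phi_{j,m\pm1}.$$

Proof. We shall show this for $J_+$; the method for $J_-$ follows analogously. Recall from
Lemma 13.12 that
$$J_-J_+ = \Omega - J_3(J_3 + \mathrm{id}_{\mathcal{D}}).$$
Now consider
$$\begin{aligned}
\|J_+\Phi_{j,m}\|^2 &= \langle J_+\Phi_{j,m}, J_+\Phi_{j,m} \rangle \\
&= \langle \Phi_{j,m}, J_-J_+\Phi_{j,m} \rangle \\
&= \langle \Phi_{j,m}, \Omega\Phi_{j,m} \rangle - \langle \Phi_{j,m}, J_3(J_3 + \mathrm{id}_{\mathcal{D}})\Phi_{j,m} \rangle \\
&= j(j+1)\langle \Phi_{j,m}, \Phi_{j,m} \rangle - m(m+1)\langle \Phi_{j,m}, \Phi_{j,m} \rangle \\
&= j(j+1) - m(m+1).
\end{aligned}$$
Combining this with Remark 13.16 allows us to conclude the result. □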
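
Proposition 13.24 is constructive: together with $J_3\Phi_{j,m} = m\Phi_{j,m}$ it fixes the matrices of $J_3$, $J_+$ and $J_-$ in the basis $\{\Phi_{j,m}\}$. The sketch below builds these matrices for $j = 3/2$ and checks them against Lemma 13.11 and Lemma 13.12. The basis ordering $m = j, j-1, \ldots, -j$, the real and positive choice of coefficients, and all function names are assumptions made for this illustration.

```python
import math
from fractions import Fraction


def spin_matrices(two_j):
    """Matrices of J3, J+ and J- in the basis Phi_{j,m}, ordered m = j, j-1, ..., -j."""
    j = Fraction(two_j, 2)
    ms = [j - n for n in range(two_j + 1)]
    dim = len(ms)
    j3 = [[float(ms[r]) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    jp = [[0.0] * dim for _ in range(dim)]
    for col, m in enumerate(ms):
        if m < j:
            # J+ Phi_{j,m} = sqrt(j(j+1) - m(m+1)) Phi_{j,m+1}; Phi_{j,m+1} has index col - 1.
            jp[col - 1][col] = math.sqrt(float(j * (j + 1) - m * (m + 1)))
    jm = [[jp[c][r] for c in range(dim)] for r in range(dim)]  # adjoint of a real matrix
    return j3, jp, jm


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def combine(a, b, alpha=1.0, beta=1.0):
    """Entrywise alpha * a + beta * b."""
    n = len(a)
    return [[alpha * a[i][j] + beta * b[i][j] for j in range(n)] for i in range(n)]


def max_abs(a):
    return max(abs(x) for row in a for x in row)


two_j = 3                      # j = 3/2, a 4-dimensional example
j = two_j / 2
j3, jp, jm = spin_matrices(two_j)
dim = two_j + 1
identity = [[1.0 if r == c else 0.0 for c in range(dim)] for r in range(dim)]

# [J+, J-] - 2 J3 should vanish (Lemma 13.11).
commutator = combine(matmul(jp, jm), matmul(jm, jp), 1.0, -1.0)
commutator_error = max_abs(combine(commutator, j3, 1.0, -2.0))

# Omega = J- J+ + J3 (J3 + id) should equal j(j+1) id (Lemma 13.12).
omega = combine(matmul(jm, jp), matmul(j3, combine(j3, identity)))
casimir_error = max_abs(combine(omega, identity, 1.0, -j * (j + 1)))
```

Changing `two_j` to any other non-negative integer gives the corresponding $(2j+1)$-dimensional matrices, which is the situation described in the next section.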


13.3 Pure Spin-j Systems

Definition. A quantum mechanical system is called a pure spin-j system if its Hilbert
space is $(2j + 1)$-dimensional and possesses an orthonormal eigenbasis $\{\Phi_{j,m}\}$ for the three
operators $J_1, J_2, J_3$ defined on $\mathcal{H}$.


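Corollary 13.25. The Hilbert space is isomorphic to $\mathbb{C}^{2j+1}$.

Corollary 13.26. For a pure spin-j system the spectrum of the operators is
$$\sigma(J_i) = \{-j, -j+1, \ldots, j-1, j\},$$
for i = 1, 2, 3.

Example 13.27.
$$\begin{array}{c|c} j & \sigma(J_i) \\ \hline 1/2 & \{-1/2, 1/2\} \\ 1 & \{-1, 0, 1\} \end{array}$$

Remark 13.28. For the electron, which carries spin $\frac{1}{2}$ in addition to its orbital degrees of freedom, the Hilbert space becomes a product
$$\mathcal{H}_e = L^2(\mathbb{R}^3) \otimes \mathbb{C}^2.$$
We shall return to this and expand on it in the next lecture.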


Remark 13.29. You can also have non-pure spin systems. For the orbital angular momentum,
you take a direct sum of the Hilbert spaces. That is, if $\mathcal{H}_j$ is the Hilbert space associated
to the pure spin-j system then the composite system's Hilbert space is
$$\mathcal{H}_{\mathrm{comp}} = \bigoplus_j \mathcal{H}_j.$$
We shall return to this in two lectures' time.




14 Composite Systems 


Recall Axiom 1, which says that to every quantum system there is an underlying Hilbert
space. The question we now want to ask is: let $\mathcal{H}_1$ be the Hilbert space associated to one
system and $\mathcal{H}_2$ be the Hilbert space associated to another. What is the underlying Hilbert
space associated to the composite system?

To clarify what we mean, imagine having a proton and an electron. We first look at the
proton by itself and call this system one. We then look at the electron separately and call
that system two. We now want to look at both of them together, but we wish to use the
fact that we have already studied them separately to simplify the problem. It may seem
'natural' to model the composite $\mathcal{H}$ as the so called direct sum, which as a set is²⁶
$$\mathcal{H}_1 \oplus \mathcal{H}_2 = \{(\psi, \varphi) \mid \psi \in \mathcal{H}_1, \ \varphi \in \mathcal{H}_2\},$$
and where the linearity is inherited from $\mathcal{H}_1$ and $\mathcal{H}_2$, namely
$$(a\psi_1 + \psi_2, b\varphi_1 + \varphi_2) = ab(\psi_1, \varphi_1) + a(\psi_1, \varphi_2) + b(\psi_2, \varphi_1) + (\psi_2, \varphi_2).$$
This is what we do in classical systems and it tells us that if we know everything about the
states²⁷ of our two systems, then we also know everything about the states of the composite
system. However, as with all things quantum, things are more complicated, and the above is
not the case. The main problem comes from the fact that not all linear combinations of
elements of the form $(\psi, \varphi)$ can also be written in that form.


Example 14.1. Let $\psi_1, \psi_2 \in \mathcal{H}_1$, $\varphi_1, \varphi_2 \in \mathcal{H}_2$ and $a, b \in \mathbb{C}$. Then, assuming the linearity as
above, we have
$$\begin{aligned}
a(\psi_1, \varphi_1) + b(\psi_2, \varphi_2) &= (a\psi_1, \varphi_1) + (\psi_2, b\varphi_2) \\
&= (a\psi_1 + \psi_2, \varphi_1 + b\varphi_2) \\
&= a(\psi_1, \varphi_1) + b(\psi_2, \varphi_2) + ab(\psi_1, \varphi_2) + (\psi_2, \varphi_1),
\end{aligned}$$
a clear problem.

Note this example actually tells us that $\mathcal{H}_1 \oplus \mathcal{H}_2$ is not closed under the linearity,
and so would not be a vector space. We could just restrict ourselves to elements that
do obey these rules; however, as we shall see when considering entanglement, we require
elements of this form in our underlying Hilbert space.


This calls for a slight refinement of axiom one. We add the addendum²⁸:

If a quantum system is composed of two (and hence, by induction, any finite number)
of 'sub'systems, then its underlying Hilbert space is the tensor product space $\mathcal{H}_1 \otimes \mathcal{H}_2$,
equipped with an inner product.


26 This definition holds as we are only taking the direct product of two spaces, and so the index set is
finite. See wiki for details on this.

27 Recall that the elements of the Hilbert space are not the states, but are associated to them. We shall
return to this at the end of the lecture.

28 We shall define what these new terms are in the next section.




Example 14.2. Remark 13.28 is an example of such a composite system. 


14.1 Tensor Product of Hilbert Spaces 


In order to give a nice definition for the tensor product of two vector spaces, we first need 
to introduce the so called free vector space. 


Definition. Let V be an F-vector space and let $B \subseteq V$ be a generating subset of V (i.e.
any element of V can be obtained via finite linear combinations of elements of B). Then the
free vector space is
$$F(B) := \mathrm{span}_F(B),$$
i.e. the set of all linear combinations of elements of B.


Lemma 14.3. Every vector space is a free vector space with B being a Hamel basis. 


Remark 14.4. Note it need not be true that F(B) = V, as it might be the case that the 
same element in V is reached via two different linear combinations of elements of B. In 
fact if F(B) =V, then B is just a Hamel basis. 


The free vector space for vector spaces might seem almost redundant, given that every vector space has a basis. However, if your vector space is infinite dimensional then such a basis might be incredibly difficult to construct. Instead, you can simply take the entire set for $B$ and construct the free vector space $F(V)$, which will be a huge set, mind. Note, then, that any linear combination of elements in this set is automatically still in the set, and so it is indeed a vector space.


Definition. Let $V$ and $W$ be two $\mathbb{F}$-vector spaces, and let $A \subseteq V$ and $B \subseteq W$ be generating subsets. Then we define their tensor product as the vector space with set
\[ V \otimes W := F(A \times B)/\sim, \]
where $\times$ is the Cartesian product and $\sim$ is an equivalence relation such that: if $a, a_1, a_2 \in A$, $b, b_1, b_2 \in B$ and $f \in \mathbb{F}$ then

(i) $(a, b) \sim (a, b)$,
(ii) $(a_1, b_1) + (a_1, b_2) \sim (a_1, b_1 + b_2)$ and $(a_1, b_1) + (a_2, b_1) \sim (a_1 + a_2, b_1)$, and continued by induction,
(iii) $f(a, b) \sim (fa, b)$ and $f(a, b) \sim (a, fb)$,
(iv) Combinations of (ii) and (iii), e.g. $(a_1, b_1) + f(a_1, b_2) \sim (a_1, b_1 + f b_2)$.

Remark 14.5. Note the equivalence relation looks a lot like a linearity condition on $V \otimes W$; however, on closer inspection it is not quite. The linearity condition that makes $V \otimes W$ into a vector space is simply
\[ f(a_1, b_1) + (a_2, b_2) \in V \otimes W. \]
This, in itself, does not need to satisfy the equivalence relation. However, if we did not include it we could end up with a huge redundancy in elements, as a repercussion of Remark 14.4. This equivalence relation makes the corresponding set of equivalence classes a vector space in the way we normally think of them (there are no repeated elements).

This construction also removes the problem highlighted before: we now only require that linear combinations of linear combinations are linear combinations, which they obviously are.


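Proposition 14.6. Let $\mathcal{H}_1$ and $\mathcal{H}_2$ be our two vector spaces. We can define the map
\[
\begin{aligned}
+_{\mathcal{H}_1 \otimes \mathcal{H}_2} : (\mathcal{H}_1 \otimes \mathcal{H}_2) \times (\mathcal{H}_1 \otimes \mathcal{H}_2) &\to \mathcal{H}_1 \otimes \mathcal{H}_2 \\
([(\psi, \varphi_1)], [(\psi, \varphi_2)]) &\mapsto [(\psi, \varphi_1)] +_{\mathcal{H}_1 \otimes \mathcal{H}_2} [(\psi, \varphi_2)] := [(\psi, \varphi_1) + (\psi, \varphi_2)],
\end{aligned}
\]
where the additions inside the brackets are w.r.t. $\mathcal{H}_1$ and $\mathcal{H}_2$.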


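Proof. We need to show this is well defined. We shall write $+_{12}$ now in order to lighten notation, and we denote a second choice of representatives by $(\tilde\psi, \tilde\varphi_1)$ and $(\tilde\psi, \tilde\varphi_2)$. Consider it case by case.

(i) Assume $(\tilde\psi, \tilde\varphi_1) = (\psi, \varphi_1)$.

(a) If $(\tilde\psi, \tilde\varphi_2) = (\psi, \varphi_2)$, then it follows trivially that
\[ [(\tilde\psi, \tilde\varphi_1) + (\tilde\psi, \tilde\varphi_2)] = [(\psi, \varphi_1) + (\psi, \varphi_2)], \]
and so
\[ [(\tilde\psi, \tilde\varphi_1)] +_{12} [(\tilde\psi, \tilde\varphi_2)] = [(\psi, \varphi_1)] +_{12} [(\psi, \varphi_2)]. \]

(b) If $(\tilde\psi, \tilde\varphi_2) = (\psi, \varphi_2^1) + (\psi, \varphi_2^2)$, where $\varphi_2 = \varphi_2^1 + \varphi_2^2$, we have
\[
\begin{aligned}
[(\tilde\psi, \tilde\varphi_1) + (\tilde\psi, \tilde\varphi_2)] &= [(\psi, \varphi_1) + (\psi, \varphi_2^1) + (\psi, \varphi_2^2)] \\
&= [(\psi, \varphi_1 + \varphi_2^1 + \varphi_2^2)] \\
&= [(\psi, \varphi_1 + \varphi_2)] \\
&= [(\psi, \varphi_1) + (\psi, \varphi_2)],
\end{aligned}
\]
and so
\[ [(\tilde\psi, \tilde\varphi_1)] +_{12} [(\tilde\psi, \tilde\varphi_2)] = [(\psi, \varphi_1)] +_{12} [(\psi, \varphi_2)]. \]

(c) If $(\tilde\psi, \tilde\varphi_2) = f(\psi, \varphi_2^1)$, where $\varphi_2 = f\varphi_2^1$, then
\[
\begin{aligned}
[(\tilde\psi, \tilde\varphi_1) + (\tilde\psi, \tilde\varphi_2)] &= [(\psi, \varphi_1) + f(\psi, \varphi_2^1)] \\
&= [(\psi, \varphi_1 + f\varphi_2^1)] \\
&= [(\psi, \varphi_1 + \varphi_2)] \\
&= [(\psi, \varphi_1) + (\psi, \varphi_2)],
\end{aligned}
\]
and so
\[ [(\tilde\psi, \tilde\varphi_1)] +_{12} [(\tilde\psi, \tilde\varphi_2)] = [(\psi, \varphi_1)] +_{12} [(\psi, \varphi_2)]. \]

(ii) Assume $(\tilde\psi, \tilde\varphi_1) = (\psi, \varphi_1^1) + (\psi, \varphi_1^2)$, where $\varphi_1 = \varphi_1^1 + \varphi_1^2$.

(a) If $(\tilde\psi, \tilde\varphi_2) = (\psi, \varphi_2)$ then we have essentially the same as (i)(b), so we won't re-write it here.

(b) If $(\tilde\psi, \tilde\varphi_2) = (\psi, \varphi_2^1) + (\psi, \varphi_2^2)$, where $\varphi_2 = \varphi_2^1 + \varphi_2^2$, we have
\[
\begin{aligned}
[(\tilde\psi, \tilde\varphi_1) + (\tilde\psi, \tilde\varphi_2)] &= [(\psi, \varphi_1^1) + (\psi, \varphi_1^2) + (\psi, \varphi_2^1) + (\psi, \varphi_2^2)] \\
&= [(\psi, \varphi_1^1 + \varphi_1^2 + \varphi_2^1 + \varphi_2^2)] \\
&= [(\psi, \varphi_1 + \varphi_2)] \\
&= [(\psi, \varphi_1) + (\psi, \varphi_2)],
\end{aligned}
\]
and so
\[ [(\tilde\psi, \tilde\varphi_1)] +_{12} [(\tilde\psi, \tilde\varphi_2)] = [(\psi, \varphi_1)] +_{12} [(\psi, \varphi_2)]. \]

(c) If $(\tilde\psi, \tilde\varphi_2) = f(\psi, \varphi_2^1)$, where $\varphi_2 = f\varphi_2^1$, then we have
\[
\begin{aligned}
[(\tilde\psi, \tilde\varphi_1) + (\tilde\psi, \tilde\varphi_2)] &= [(\psi, \varphi_1^1) + (\psi, \varphi_1^2) + f(\psi, \varphi_2^1)] \\
&= [(\psi, \varphi_1^1 + \varphi_1^2 + f\varphi_2^1)] \\
&= [(\psi, \varphi_1 + \varphi_2)] \\
&= [(\psi, \varphi_1) + (\psi, \varphi_2)],
\end{aligned}
\]
and so
\[ [(\tilde\psi, \tilde\varphi_1)] +_{12} [(\tilde\psi, \tilde\varphi_2)] = [(\psi, \varphi_1)] +_{12} [(\psi, \varphi_2)]. \]

(iii) Assume $(\tilde\psi, \tilde\varphi_1) = g(\psi, \varphi_1^1)$, where $\varphi_1 = g\varphi_1^1$.

(a) If $(\tilde\psi, \tilde\varphi_2) = (\psi, \varphi_2)$, then we have basically the same as (i)(c) and so we won't write it again.

(b) If $(\tilde\psi, \tilde\varphi_2) = (\psi, \varphi_2^1) + (\psi, \varphi_2^2)$, where $\varphi_2 = \varphi_2^1 + \varphi_2^2$, then we have basically the same as (ii)(c) and so we won't write it again.

(c) If $(\tilde\psi, \tilde\varphi_2) = f(\psi, \varphi_2^1)$, where $\varphi_2 = f\varphi_2^1$, then
\[
\begin{aligned}
[(\tilde\psi, \tilde\varphi_1) + (\tilde\psi, \tilde\varphi_2)] &= [g(\psi, \varphi_1^1) + f(\psi, \varphi_2^1)] \\
&= [(\psi, g\varphi_1^1 + f\varphi_2^1)] \\
&= [(\psi, \varphi_1 + \varphi_2)] \\
&= [(\psi, \varphi_1) + (\psi, \varphi_2)],
\end{aligned}
\]
and so
\[ [(\tilde\psi, \tilde\varphi_1)] +_{12} [(\tilde\psi, \tilde\varphi_2)] = [(\psi, \varphi_1)] +_{12} [(\psi, \varphi_2)]. \]
$\square$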


Remark 14.7. We can do exactly the same thing but for a map that has the first element 
different and the second element the same. 




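Definition. Let $\mathcal{H}_1$ and $\mathcal{H}_2$ be complex Hilbert spaces with sesqui-linear inner products $\langle\cdot,\cdot\rangle_{\mathcal{H}_1}$ and $\langle\cdot,\cdot\rangle_{\mathcal{H}_2}$, respectively. Then the composite Hilbert space is the Hilbert space with set
\[ \mathcal{H}_1 \otimes \mathcal{H}_2 := \overline{F(\mathcal{H}_1 \times \mathcal{H}_2)/\sim}, \]
where the overline indicates the topological closure, and with sesqui-linear inner product: for $\psi_1, \psi_2 \in \mathcal{H}_1$ and $\varphi_1, \varphi_2 \in \mathcal{H}_2$,
\[ \langle [(\psi_1, \varphi_1)], [(\psi_2, \varphi_2)] \rangle_{\mathcal{H}_1 \otimes \mathcal{H}_2} := \langle \psi_1, \psi_2 \rangle_{\mathcal{H}_1} \cdot \langle \varphi_1, \varphi_2 \rangle_{\mathcal{H}_2}, \]
extended by linearity, with respect to which the closure is taken (i.e. the topology is derived from here).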

Remark 14.8. Note, we need to take the topological closure as the free vector space only 
considers finite linear combinations, but our Hilbert spaces could be infinite dimensional. 


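Proof. (that we have a sesqui-linear inner product).

(i) Conjugate symmetry.
\[
\begin{aligned}
\overline{\langle [(\psi_1, \varphi_1)], [(\psi_2, \varphi_2)] \rangle_{\mathcal{H}_1 \otimes \mathcal{H}_2}} &= \overline{\langle \psi_1, \psi_2 \rangle_{\mathcal{H}_1} \cdot \langle \varphi_1, \varphi_2 \rangle_{\mathcal{H}_2}} \\
&= \overline{\langle \psi_1, \psi_2 \rangle_{\mathcal{H}_1}} \cdot \overline{\langle \varphi_1, \varphi_2 \rangle_{\mathcal{H}_2}} \\
&= \langle \psi_2, \psi_1 \rangle_{\mathcal{H}_1} \cdot \langle \varphi_2, \varphi_1 \rangle_{\mathcal{H}_2} \\
&= \langle [(\psi_2, \varphi_2)], [(\psi_1, \varphi_1)] \rangle_{\mathcal{H}_1 \otimes \mathcal{H}_2}.
\end{aligned}
\]

(ii) Linearity in second argument. The extension by linearity means
\[
\begin{aligned}
\Big\langle [(\psi_1, \varphi_1)], \sum_i z_i [(\psi_i, \varphi_i)] \Big\rangle_{\mathcal{H}_1 \otimes \mathcal{H}_2} &= \sum_i z_i \langle \psi_1, \psi_i \rangle_{\mathcal{H}_1} \langle \varphi_1, \varphi_i \rangle_{\mathcal{H}_2} \\
&= \sum_i z_i \langle [(\psi_1, \varphi_1)], [(\psi_i, \varphi_i)] \rangle_{\mathcal{H}_1 \otimes \mathcal{H}_2}
\end{aligned}
\]
for $z_i \in \mathbb{C}$.

(iii) Positive-definiteness. As $\langle\cdot,\cdot\rangle_{\mathcal{H}_1} \geq 0$ and $\langle\cdot,\cdot\rangle_{\mathcal{H}_2} \geq 0$ it follows that^{29} $\langle -, - \rangle_{\mathcal{H}_1 \otimes \mathcal{H}_2} \geq 0$. Then from
\[ (0_{\mathcal{H}_1}, \varphi) = (0 \cdot \psi, \varphi) \sim 0(\psi, \varphi) \sim (\psi, 0 \cdot \varphi) = (\psi, 0_{\mathcal{H}_2}) \]
we have
\[ [(0_{\mathcal{H}_1}, \varphi)] = [(\psi, 0_{\mathcal{H}_2})] =: 0_{\mathcal{H}_1 \otimes \mathcal{H}_2}. \]
Finally, from
\[ 0 = \langle [(\psi, \varphi)], [(\psi, \varphi)] \rangle_{\mathcal{H}_1 \otimes \mathcal{H}_2} = \langle \psi, \psi \rangle_{\mathcal{H}_1} \cdot \langle \varphi, \varphi \rangle_{\mathcal{H}_2}, \]
which implies $\psi = 0_{\mathcal{H}_1}$ and/or $\varphi = 0_{\mathcal{H}_2}$, we get $[(\psi, \varphi)] = 0_{\mathcal{H}_1 \otimes \mathcal{H}_2}$.

29 We shall use '$-$' for empty slots on the composite space.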


We also need to check that the sesqui-linear inner product is well defined. 


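Proof. The proof follows a similar method to the proof of Proposition 14.6. We shall just show the first two results here in order to save space. We write $\langle\cdot,\cdot\rangle_1$, $\langle\cdot,\cdot\rangle_2$ and $\langle\cdot,\cdot\rangle_{12}$ for the inner products on $\mathcal{H}_1$, $\mathcal{H}_2$ and $\mathcal{H}_1 \otimes \mathcal{H}_2$.

(i) Assume $(\tilde\psi_1, \tilde\varphi_1) = (\psi_1, \varphi_1)$.

(a) If $(\tilde\psi_2, \tilde\varphi_2) = (\psi_2, \varphi_2)$, the inner product result follows trivially.

(b) If $(\tilde\psi_2, \tilde\varphi_2) = (\psi_2, \varphi_2^1) + (\psi_2, \varphi_2^2)$, where $\varphi_2 = \varphi_2^1 + \varphi_2^2$, then
\[
\begin{aligned}
\langle [(\tilde\psi_1, \tilde\varphi_1)], [(\tilde\psi_2, \tilde\varphi_2)] \rangle_{12} &= \langle [(\psi_1, \varphi_1)], [(\psi_2, \varphi_2^1) + (\psi_2, \varphi_2^2)] \rangle_{12} \\
&= \langle [(\psi_1, \varphi_1)], [(\psi_2, \varphi_2^1)] +_{12} [(\psi_2, \varphi_2^2)] \rangle_{12} \\
&= \langle [(\psi_1, \varphi_1)], [(\psi_2, \varphi_2^1)] \rangle_{12} + \langle [(\psi_1, \varphi_1)], [(\psi_2, \varphi_2^2)] \rangle_{12} \\
&= \langle \psi_1, \psi_2 \rangle_1 \langle \varphi_1, \varphi_2^1 \rangle_2 + \langle \psi_1, \psi_2 \rangle_1 \langle \varphi_1, \varphi_2^2 \rangle_2 \\
&= \langle \psi_1, \psi_2 \rangle_1 \big( \langle \varphi_1, \varphi_2^1 \rangle_2 + \langle \varphi_1, \varphi_2^2 \rangle_2 \big) \\
&= \langle \psi_1, \psi_2 \rangle_1 \langle \varphi_1, \varphi_2^1 + \varphi_2^2 \rangle_2 \\
&= \langle \psi_1, \psi_2 \rangle_1 \langle \varphi_1, \varphi_2 \rangle_2 \\
&= \langle [(\psi_1, \varphi_1)], [(\psi_2, \varphi_2)] \rangle_{12}.
\end{aligned}
\]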


We introduce the new notation
\[ \psi \boxtimes \varphi := [(\psi, \varphi)]. \]
Here we have used a $\boxtimes$ for the tensor product of two vectors. We have done this in order to highlight the fact that it is not the same thing as $\otimes$, which is the tensor product between vector spaces. We will, however, end up using $\otimes$ for all tensor products later, as this is the common notation. It is important to remember that they are distinctly different objects, and, if in doubt, we should go back to the definitions to clarify the circumstance.

In this new notation we can rewrite the definition for the sesqui-linear inner product simply as
\[ \langle \psi_1 \boxtimes \varphi_1, \psi_2 \boxtimes \varphi_2 \rangle_{\mathcal{H}_1 \otimes \mathcal{H}_2} = \langle \psi_1, \psi_2 \rangle_{\mathcal{H}_1} \langle \varphi_1, \varphi_2 \rangle_{\mathcal{H}_2}, \]
extended by linearity.


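Example 14.9. This example acts as a further warning that it's important that we consider the space $F(\mathcal{H}_1 \times \mathcal{H}_2)$ and not just $\mathcal{H}_1 \times \mathcal{H}_2$.

Let $\mathcal{H}_1 = \mathcal{H}_2 = \mathbb{C}^2$. Then we can express the elements as $2 \times 1$ matrices, in which case we can consider $\boxtimes$ to be the outer product. Note then that
\[
\begin{pmatrix} 1 \\ 0 \end{pmatrix} \boxtimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} \boxtimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]
is in $\mathcal{H}_1 \otimes \mathcal{H}_2$, but it cannot be written as $\psi \boxtimes \varphi$ for some $\psi \in \mathcal{H}_1$ and $\varphi \in \mathcal{H}_2$.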
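
The claim in Example 14.9 can be checked by hand or with a few lines of code. The following is a minimal sketch, assuming (as in the example) that elements of $\mathbb{C}^2 \otimes \mathbb{C}^2$ are represented by $2 \times 2$ matrices and elementary tensors by outer products; the helper names `outer`, `add` and `det2` are illustrative only. It relies on the fact that every outer product of two 2-vectors has vanishing determinant.

```python
def outer(psi, phi):
    """Outer product psi phi^T, standing in for the elementary tensor of psi and phi."""
    return [[p * q for q in phi] for p in psi]


def add(m1, m2):
    """Entry-wise sum of two matrices of equal shape."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


e1, e2 = [1, 0], [0, 1]
simple = outer([1, 2], [3, 4])                  # a single elementary tensor
non_simple = add(outer(e1, e1), outer(e2, e2))  # the 2x2 identity matrix

# A non-zero determinant rules out the form psi (x) phi.
print(det2(simple), det2(non_simple))
```

The determinant test is specific to the $2 \times 2$ case; in higher dimensions one would look at the rank of the coefficient matrix instead.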




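Theorem 14.10. Let $\{e_i\}_{i=1,\dots,\dim(\mathcal{H}_1)}$ and $\{f_j\}_{j=1,\dots,\dim(\mathcal{H}_2)}$ be Schauder (ON)-bases for $\mathcal{H}_1$ and $\mathcal{H}_2$ respectively. Then we can construct a Schauder (ON)-basis for $\mathcal{H}_1 \otimes \mathcal{H}_2$ as
\[ \{e_i \boxtimes f_j\}_{i=1,\dots,\dim(\mathcal{H}_1);\ j=1,\dots,\dim(\mathcal{H}_2)}. \]

Corollary 14.11. We can rewrite
\[ \mathcal{H}_1 \otimes \mathcal{H}_2 = \Big\{ \sum_{i=1}^{\dim(\mathcal{H}_1)} \sum_{j=1}^{\dim(\mathcal{H}_2)} a_{ij}\, e_i \boxtimes f_j \ \Big|\ a_{ij} \in \mathbb{C},\ \sum_{i,j} |a_{ij}|^2 < \infty \Big\}, \]
from which it also follows that
\[ \dim(\mathcal{H}_1 \otimes \mathcal{H}_2) = \dim(\mathcal{H}_1) \cdot \dim(\mathcal{H}_2). \]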


14.2 Practical Rules for Tensor Products of Vectors 


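This short section just highlights a couple of rules obeyed by $\boxtimes$.

(i) Let $\psi_1, \psi_2 \in \mathcal{H}_1$, $\varphi_1, \varphi_2 \in \mathcal{H}_2$ and $\alpha, \beta \in \mathbb{C}$. Then the following holds:
\[ (\psi_1 + \alpha\psi_2) \boxtimes (\varphi_1 + \beta\varphi_2) = \psi_1 \boxtimes \varphi_1 + \alpha\, \psi_2 \boxtimes \varphi_1 + \beta\, \psi_1 \boxtimes \varphi_2 + \alpha\beta\, \psi_2 \boxtimes \varphi_2. \]

(ii) Given that $\{e_i \boxtimes f_j\}$ is a basis, we have:
\[ \forall \Psi \in \mathcal{H}_1 \otimes \mathcal{H}_2 \ \exists\, a_{ij} \in \mathbb{C} : \Psi = \sum_{i,j} a_{ij}\, e_i \boxtimes f_j. \]

Remark 14.12. Note, obviously, that the order matters when taking a tensor product. In other words, in general
\[ \psi \boxtimes \varphi \neq \varphi \boxtimes \psi. \]
Note, it's not even a case of 'choosing the right $\psi$ and $\varphi$', as the LHS is an element of $\mathcal{H}_1 \otimes \mathcal{H}_2$ whereas the RHS is an element of $\mathcal{H}_2 \otimes \mathcal{H}_1$. So, unless the two spaces are the same, they could never be equal.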
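
Rule (i) and Remark 14.12 can be illustrated in coordinates. The sketch below assumes finite-dimensional spaces and represents the elementary tensor of two vectors by its list of coordinates $\psi_i \varphi_j$ in the basis $\{e_i \boxtimes f_j\}$ (the Kronecker product); the function names `kron` and `lin` are illustrative choices, not notation from the lectures.

```python
def kron(psi, phi):
    """Coordinates of the elementary tensor of psi and phi in the basis e_i (x) f_j."""
    return [p * q for p in psi for q in phi]


def lin(a, u, b, v):
    """The linear combination a*u + b*v of two coordinate lists."""
    return [a * x + b * y for x, y in zip(u, v)]


psi1, psi2 = [1, 2], [0, 1j]
phi1, phi2 = [3, 1, 0], [1, -1, 2]
alpha, beta = 2, 1 - 1j

# Left-hand side of rule (i): tensor product of the two linear combinations.
lhs = kron(lin(1, psi1, alpha, psi2), lin(1, phi1, beta, phi2))

# Right-hand side of rule (i): the four-term expansion.
rhs = [
    a + alpha * b + beta * c + alpha * beta * d
    for a, b, c, d in zip(kron(psi1, phi1), kron(psi2, phi1),
                          kron(psi1, phi2), kron(psi2, phi2))
]
rule_holds = all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs))

# Order matters even when both factors live in the same space.
e1, e2 = [1, 0], [0, 1]
order_matters = kron(e1, e2) != kron(e2, e1)
print(rule_holds, order_matters)
```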


14.3 Tensor Product Between Operators 
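Definition. Let $A : \mathcal{H}_1 \to \mathcal{H}_1$ and $B : \mathcal{H}_2 \to \mathcal{H}_2$ be linear maps. Then we define their tensor product as
\[
\begin{aligned}
A \hat\otimes B : \mathcal{H}_1 \otimes \mathcal{H}_2 &\to \mathcal{H}_1 \otimes \mathcal{H}_2 \\
\psi \boxtimes \varphi &\mapsto (A \hat\otimes B)(\psi \boxtimes \varphi) := (A\psi) \boxtimes (B\varphi).
\end{aligned}
\]

Theorem 14.13. If $A : \mathcal{H}_1 \to \mathcal{H}_1$ and $B : \mathcal{H}_2 \to \mathcal{H}_2$ are self adjoint then their tensor product $A \hat\otimes B$ is also self adjoint on $\mathcal{H}_1 \otimes \mathcal{H}_2$.

Proof. We have $A = A^*$ and $B = B^*$, i.e. their domains coincide and $A\psi = A^*\psi$ for all $\psi \in \mathcal{D}_A$, and similarly for $B$ and $B^*$. Then we have
\[
\begin{aligned}
A \hat\otimes B : \mathcal{D}_A \otimes \mathcal{D}_B &\to \mathcal{H}_1 \otimes \mathcal{H}_2 \\
\psi \boxtimes \varphi &\mapsto (A\psi) \boxtimes (B\varphi),
\end{aligned}
\]
and
\[
\begin{aligned}
A^* \hat\otimes B^* : \mathcal{D}_A \otimes \mathcal{D}_B &\to \mathcal{H}_1 \otimes \mathcal{H}_2 \\
\psi \boxtimes \varphi &\mapsto (A^*\psi) \boxtimes (B^*\varphi) = (A\psi) \boxtimes (B\varphi),
\end{aligned}
\]
and so the domains coincide and they have the same result for all $\psi \boxtimes \varphi \in \mathcal{D}_A \otimes \mathcal{D}_B$. So it is self adjoint. $\square$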


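Theorem 14.14. If $A : \mathcal{H}_1 \to \mathcal{H}_1$ and $B : \mathcal{H}_2 \to \mathcal{H}_2$ are self adjoint then

(i) $\sigma(A \hat\otimes B) = \overline{\sigma(A) \cdot \sigma(B)}$, where the overline is the topological closure and the $\cdot$ indicates all possible products of elements in the sets.

(ii) $\sigma(A \hat\otimes \mathrm{id}_{\mathcal{H}_2} + \mathrm{id}_{\mathcal{H}_1} \hat\otimes B) = \overline{\sigma(A) + \sigma(B)}$, where again the overline is the topological closure and the $+$ indicates all possible sums of elements in the sets.

The second of these results is useful when you know how to measure the observable $A$ on system 1 and $B$ on system 2: it tells you how to measure them on the composite system.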


14.4 Symmetric and Antisymmetric Tensor Products 


Recalling Remark 14.12, if we do have $\mathcal{H}_1 = \mathcal{H}_2$ it is possible to define a symmetric and an antisymmetric tensor product. These definitions are important in quantum mechanics as
they allow us to categorise particles according to their so called exchange statistics. The 
symmetric composite system concerns a system of two (and by induction, any number) of 
bosons, whereas the antisymmetric one corresponds to fermions. These are both examples 
of what are known as indistinguishable particles, meaning that two fermions of the same 
type (two electrons, say) cannot be distinguished from each other. A good analogy is to 
consider two identical looking balls. Imagine being in a room with the two balls on the 
floor. Someone asks you to leave the room and then calls you back in. They then ask you 
whether the two balls, still in the same places on the floor, have switched places or not? 
Of course there is no way for you to know, as they look identical, and you weren’t present 
when they potentially could have switched. 

The version in QM is related to whether they live on the same Hilbert space. Recalling Remark 13.28, we see that this means that, not only are they allowed to move within the same physical space, they also have the same angular momentum (or spin). For the two particles to be indistinguishable, their composite Hilbert spaces must be the same. For example, if a divider was put between the balls, and you knew the balls could only move along the floor, you would know that they couldn't possibly have changed places. This could correspond to one electron having $L^2(U, \lambda)$ and the other having $L^2(V, \lambda)$, where $U, V \subset \mathbb{R}^3$ with $U \cap V = \emptyset$.


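Definition. Let $\psi, \varphi \in \mathcal{H}$, then we define their symmetric tensor product as
\[ \psi \boxdot \varphi := \tfrac{1}{2}(\psi \boxtimes \varphi + \varphi \boxtimes \psi), \]
which is an element of the symmetric composite Hilbert space, defined as
\[ \mathcal{H} \odot \mathcal{H} := \Big\{ \sum_{i,j=1}^{\dim(\mathcal{H})} a_{ij}\, e_i \boxdot e_j \ \Big|\ a_{ij} \in \mathbb{C},\ \sum_{i,j} |a_{ij}|^2 < \infty \Big\}, \]
where $\{e_i\}$ is a basis of $\mathcal{H}$.

Remark 14.15. Note it follows from the definition that $a_{ij} = a_{ji}$ for a symmetric composite Hilbert space.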


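Definition. Let $\psi, \varphi \in \mathcal{H}$, then we define their antisymmetric tensor product as
\[ \psi \boxminus \varphi := \tfrac{1}{2}(\psi \boxtimes \varphi - \varphi \boxtimes \psi), \]
which is an element of the antisymmetric composite Hilbert space, defined as
\[ \mathcal{H} \ominus \mathcal{H} := \Big\{ \sum_{i,j=1}^{\dim(\mathcal{H})} a_{ij}\, e_i \boxminus e_j \ \Big|\ a_{ij} \in \mathbb{C},\ \sum_{i,j} |a_{ij}|^2 < \infty \Big\}. \]

Remark 14.16. Note it follows from the definition that $a_{ij} = -a_{ji}$ for an antisymmetric composite Hilbert space.

Remark 14.17. For $\psi \in \mathcal{H}$ we have $\psi \boxminus \psi = 0$, which is known as the Pauli exclusion principle for fermions.

Remark 14.18. For $\psi, \varphi \in \mathcal{H}$ where $\psi$ and $\varphi$ are linearly independent, we have
\[ \psi \boxminus \varphi = \tfrac{1}{2}(\psi \boxtimes \varphi - \varphi \boxtimes \psi) \neq \tilde\psi \boxtimes \tilde\varphi \]
for any $\tilde\psi, \tilde\varphi \in \mathcal{H}$. This again emphasises that it's important we consider the space of all linear combinations.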
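
Remarks 14.17 and 14.18 can be checked in coordinates for $\mathcal{H} = \mathbb{C}^2$. This is a sketch under stated assumptions: elementary tensors are represented by Kronecker products of coordinate lists, exact rationals are used to avoid rounding, and the function names are illustrative only.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def kron(psi, phi):
    """Coordinates of the elementary tensor of psi and phi."""
    return [p * q for p in psi for q in phi]


def sym(psi, phi):
    """Symmetric product: half of (psi (x) phi + phi (x) psi)."""
    return [HALF * (a + b) for a, b in zip(kron(psi, phi), kron(phi, psi))]


def antisym(psi, phi):
    """Antisymmetric product: half of (psi (x) phi - phi (x) psi)."""
    return [HALF * (a - b) for a, b in zip(kron(psi, phi), kron(phi, psi))]


psi, phi = [1, 0], [0, 1]

pauli = antisym(psi, psi)   # vanishes identically (Remark 14.17)
wedge = antisym(psi, phi)   # coordinates a_11, a_12, a_21, a_22

# For C^2 (x) C^2 an element is an elementary tensor only if the determinant
# of its 2x2 coefficient matrix vanishes (Remark 14.18 says it does not here).
det = wedge[0] * wedge[3] - wedge[1] * wedge[2]
print(pauli, wedge, det)
```

The coefficient list `wedge` also exhibits $a_{ij} = -a_{ji}$, in line with Remark 14.16.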


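Definition. Let $A, B : \mathcal{H} \to \mathcal{H}$ be linear operators. Then we can define the symmetric tensor product of linear maps as
\[
\begin{aligned}
A \hat\odot B : \mathcal{H} \odot \mathcal{H} &\to \mathcal{H} \odot \mathcal{H} \\
\psi \boxdot \varphi &\mapsto (A \hat\odot B)(\psi \boxdot \varphi) := (A\psi) \boxdot (B\varphi).
\end{aligned}
\]

Definition. Let $A, B : \mathcal{H} \to \mathcal{H}$ be linear operators. Then we can define the antisymmetric tensor product of linear maps as
\[
\begin{aligned}
A \hat\ominus B : \mathcal{H} \ominus \mathcal{H} &\to \mathcal{H} \ominus \mathcal{H} \\
\psi \boxminus \varphi &\mapsto (A \hat\ominus B)(\psi \boxminus \varphi) := (A\psi) \boxminus (B\varphi).
\end{aligned}
\]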


14.5 Collapse of Notation 


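As mentioned before, we shall now change our notation to that of the standard literature. That is, the three tensor product symbols (between spaces, between vectors, and between operators) are all written $\otimes$, the three symmetric ones are all written $\odot$, and the three antisymmetric ones are all written $\wedge$.
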
14.6 Entanglement 


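As has been stressed many times, recall
\[ \{\psi \otimes \varphi \mid \psi \in \mathcal{H}_1,\ \varphi \in \mathcal{H}_2\} \subsetneq \mathcal{H}_1 \otimes \mathcal{H}_2. \]

Definition. We call an element $\Psi \in \mathcal{H}_1 \otimes \mathcal{H}_2$ simple if there exist a $\psi \in \mathcal{H}_1$ and a $\varphi \in \mathcal{H}_2$ such that
\[ \Psi = \psi \otimes \varphi. \]
If it is not of this form (i.e. you need linear combinations) then it is called non-simple.

Recall: A state $\rho : \mathcal{H} \to \mathcal{H}$ is called pure if
\[ \exists\, \psi \in \mathcal{H} : \forall \alpha \in \mathcal{H} : \rho_\psi(\alpha) = \frac{\langle \psi, \alpha \rangle}{\langle \psi, \psi \rangle}\, \psi, \]
or, equivalently, we can think of
\[ \rho_\psi(\cdot) = \frac{\langle \psi, \cdot \rangle}{\langle \psi, \psi \rangle}\, \psi. \]

Definition. Let $\Psi \in \mathcal{H}_1 \otimes \mathcal{H}_2$. Then a pure state $\rho_\Psi$ on the composite system is called non-entangled if there exist $\rho_\psi$ and $\rho_\varphi$ for $\psi \in \mathcal{H}_1$ and $\varphi \in \mathcal{H}_2$ such that^{30}
\[ \rho_\Psi = \rho_\psi \otimes \rho_\varphi. \]
Otherwise, the state is called entangled.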


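Lemma 14.19. A state $\rho_\Psi$ is non-entangled if and only if $\Psi$ is simple.

Proof. Assume $\Psi$ is simple. Then
\[
\begin{aligned}
\rho_\Psi(\cdot) &= \frac{\langle \Psi, \cdot \rangle_{12}}{\langle \Psi, \Psi \rangle_{12}}\, \Psi \\
&= \frac{\langle \psi \otimes \varphi, \cdot \rangle_{12}}{\langle \psi \otimes \varphi, \psi \otimes \varphi \rangle_{12}}\, (\psi \otimes \varphi) \\
&= \frac{\langle \psi, \cdot \rangle_1 \langle \varphi, \cdot \rangle_2}{\langle \psi, \psi \rangle_1 \langle \varphi, \varphi \rangle_2}\, (\psi \otimes \varphi) \\
&= \Big( \frac{\langle \psi, \cdot \rangle_1}{\langle \psi, \psi \rangle_1}\, \psi \Big) \otimes \Big( \frac{\langle \varphi, \cdot \rangle_2}{\langle \varphi, \varphi \rangle_2}\, \varphi \Big) \\
&= \rho_\psi(\cdot) \otimes \rho_\varphi(\cdot),
\end{aligned}
\]
where in the last two lines the $\otimes$ is the tensor product between linear maps.

The reverse part of the proof (starting from $\rho_\Psi$ non-entangled) follows from working backwards through the above. $\square$

30 Note the tensor product here is that between linear operators.

Lemma 14.20. A state $\rho_\Psi$ is entangled if and only if $\Psi$ is non-simple.

Proof. This proof is trivial given the previous one, as if $\Psi$ is non-simple then $\rho_\Psi$ cannot be non-entangled, and vice versa. $\square$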
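
In finite dimensions the two lemmas give a practical test. Writing $\Psi = \sum_{i,j} a_{ij}\, e_i \otimes f_j$, a non-zero $\Psi$ is simple exactly when the coefficient matrix $(a_{ij})$ has rank one, since $\psi \otimes \varphi$ has coefficients $a_{ij} = \psi_i \varphi_j$. The sketch below assumes rational coefficients so that Gaussian elimination is exact; `rank` and `is_simple` are illustrative helper names.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix via Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_simple(coeffs):
    """True if the non-zero element with coefficient matrix coeffs is simple."""
    return rank(coeffs) == 1


product_state = [[1, 2], [2, 4]]   # (e1 + 2 e2) (x) (f1 + 2 f2)
bell_like = [[1, 0], [0, 1]]       # e1 (x) f1 + e2 (x) f2
print(is_simple(product_state), is_simple(bell_like))
```

By Lemma 14.20, the second example corresponds to an entangled state.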




15 Total Spin of Composite System 


The lecture aims to answer the following question: "What is the total angular momentum (or spin) of a bi-partite system if we know the spin of each constituent system?"

More precisely, in the context of quantum mechanics, consider a spin-$j_A$ system with Hilbert space $\mathcal{H}_A$ and angular momentum operators $A_1, A_2, A_3$, and a spin-$j_B$ system with Hilbert space $\mathcal{H}_B$ and angular momentum operators $B_1, B_2, B_3$. Then what is the spin of the composite system with Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$, and how do we construct the angular momentum operators for this composite system?


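Proposition 15.1. The operators $A_i \otimes \mathrm{id}_{\mathcal{H}_B}$, for $i = 1, 2, 3$, satisfy the spin commutation relations. Similarly for $\mathrm{id}_{\mathcal{H}_A} \otimes B_i$.

Proof. We shall use the general expression involving the Levi-Civita symbol. Consider the action on a general simple element $\alpha \otimes \beta \in \mathcal{H}_A \otimes \mathcal{H}_B$,
\[
\begin{aligned}
[A_i \otimes \mathrm{id}_{\mathcal{H}_B}, A_j \otimes \mathrm{id}_{\mathcal{H}_B}](\alpha \otimes \beta) &:= (A_i \otimes \mathrm{id}_{\mathcal{H}_B})\big((A_j\alpha) \otimes \beta\big) - (A_j \otimes \mathrm{id}_{\mathcal{H}_B})\big((A_i\alpha) \otimes \beta\big) \\
&= A_i(A_j\alpha) \otimes \beta - A_j(A_i\alpha) \otimes \beta \\
&= \big((A_iA_j - A_jA_i)\alpha\big) \otimes \beta \\
&= \big([A_i, A_j]\alpha\big) \otimes \beta \\
&= i\epsilon_{ijk}\, (A_k\alpha) \otimes \beta \\
&= i\epsilon_{ijk}\, (A_k \otimes \mathrm{id}_{\mathcal{H}_B})(\alpha \otimes \beta),
\end{aligned}
\]
which, because $\alpha \otimes \beta$ was arbitrary and the operators are linear, holds for any element of $\mathcal{H}_A \otimes \mathcal{H}_B$.

The method is identical for the $\mathrm{id}_{\mathcal{H}_A} \otimes B_i$ case. $\square$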


Now before moving on recall (page 137) that we have an ON-eigenbasis^{31} for each constituent system. That is, if $A^2$ is the Casimir operator for the spin-$j_A$ system then we have the ON-eigenbasis
\[ \{\alpha^{m_A}_{j_A}\}_{m_A = -j_A, \dots, j_A} \]
with
\[ A^2 \alpha^{m_A}_{j_A} = j_A(j_A + 1)\, \alpha^{m_A}_{j_A}. \]
Similarly we have $B^2$ and $\{\beta^{m_B}_{j_B}\}_{m_B = -j_B, \dots, j_B}$.


15.1 Total Spin

Remark 15.2. From now on we shall simply write $1$ instead of $\mathrm{id}_{\mathcal{H}_A}$ and $\mathrm{id}_{\mathcal{H}_B}$, and the placement relative to the tensor product will indicate which is meant.

In everything that follows it is important to note that $j_A$ and $j_B$ are fixed. This condition shall come in use later.

31 ON here stands for orthonormal.


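Definition. We define the self adjoint angular momentum operators $J_1, J_2, J_3$ on the composite space $\mathcal{H}_A \otimes \mathcal{H}_B$ as
\[ J_i := A_i \otimes 1 + 1 \otimes B_i, \]
and we call them the total spin operators.

Proof. (that they obey the spin commutation relations)
Consider
\[
\begin{aligned}
[A_i \otimes 1, 1 \otimes B_j](\alpha \otimes \beta) &:= (A_i \otimes 1)(\alpha \otimes B_j\beta) - (1 \otimes B_j)(A_i\alpha \otimes \beta) \\
&= (A_i\alpha) \otimes (B_j\beta) - (A_i\alpha) \otimes (B_j\beta) \\
&= 0,
\end{aligned}
\]
which, from the fact that the commutator bracket is bilinear and antisymmetric in its entries, along with Proposition 15.1 gives the result. $\square$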
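
The commutation relations can be verified numerically in the smallest non-trivial case. The sketch below assumes two spin-$1/2$ systems with $A_i = B_i = \sigma_i/2$ (the Pauli matrices, in units where $\hbar = 1$), and represents $\otimes$ between operators by the Kronecker product of matrices; all helper names are illustrative.

```python
def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Kronecker product, representing the tensor product of two operators."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def combine(a, b, s=1):
    """Entry-wise a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), -1)


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


ID2 = [[1, 0], [0, 1]]
# Spin-1/2 operators sigma_i / 2 (assumed example, hbar = 1).
S = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]
# Total spin operators J_i = A_i (x) 1 + 1 (x) B_i.
J = [combine(kron(s, ID2), kron(ID2, s)) for s in S]

ZERO = [[0] * 4 for _ in range(4)]
CYCLIC = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]

# [J_i, J_j] = i J_k for cyclic (i, j, k).
relations_hold = all(
    close(commutator(J[i], J[j]), [[1j * x for x in row] for row in J[k]])
    for i, j, k in CYCLIC
)
# The cross terms [A_i (x) 1, 1 (x) B_j] all vanish.
cross_terms_vanish = all(
    close(commutator(kron(S[i], ID2), kron(ID2, S[j])), ZERO)
    for i in range(3) for j in range(3)
)
print(relations_hold, cross_terms_vanish)
```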


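Definition. We define the Casimir operator for the composite system as always,
\[ J^2 := \sum_{i=1}^{3} J_i \circ J_i. \]

Definition. We define the total ladder operators as
\[ J_\pm := A_\pm \otimes 1 + 1 \otimes B_\pm. \]

We now want to find the common eigenvalues of $J^2$ and one of the total spin operators, $J_3$ say. We will show the following results:
\[
\begin{aligned}
\sigma(J^2) &= \{ j(j+1) \mid j = |j_A - j_B|, \dots, j_A + j_B \}, \\
\sigma(J_3) &= \{ -(j_A + j_B), \dots, j_A + j_B \}.
\end{aligned}
\]

Remark 15.3. Note from Theorem 14.14, we can already obtain the second of these two results. That is
\[
\begin{aligned}
\sigma(J_3) &= \sigma(A_3 \otimes 1 + 1 \otimes B_3) \\
&= \overline{\sigma(A_3) + \sigma(B_3)} \\
&= \{-j_A, \dots, j_A\} + \{-j_B, \dots, j_B\} \\
&= \{-(j_A + j_B), \dots, j_A + j_B\}.
\end{aligned}
\]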
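
The set sum in Remark 15.3 is easy to enumerate. The following sketch assumes the spins are passed as exact fractions; `m_values` and `j3_spectrum` are illustrative names. Since both spectra are finite sets, the topological closure plays no role here.

```python
from fractions import Fraction


def m_values(j):
    """The values m = -j, -j + 1, ..., j for a spin j."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]


def j3_spectrum(j_a, j_b):
    """All sums m_A + m_B, i.e. sigma(A_3) + sigma(B_3) as a set sum."""
    return sorted({m_a + m_b for m_a in m_values(j_a) for m_b in m_values(j_b)})


spectrum = j3_spectrum(Fraction(1, 2), 1)
print(spectrum)
```

For $j_A = 1/2$ and $j_B = 1$ this runs from $-(j_A + j_B) = -3/2$ to $j_A + j_B = 3/2$ in integer steps.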
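15.2 Eigenbasis For The Composite System in Terms of Simultaneous Eigenvectors of $A^2 \otimes 1$, $1 \otimes B^2$, $A_3 \otimes 1$ and $1 \otimes B_3$

We already know that $\{\alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}\}$, for $m_A = -j_A, \dots, j_A$ and $m_B = -j_B, \dots, j_B$, are common eigenvectors of all four operators with eigenvalues
\[
\begin{aligned}
(A^2 \otimes 1)(\alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}) &= j_A(j_A + 1)\, \alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B} \\
(1 \otimes B^2)(\alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}) &= j_B(j_B + 1)\, \alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B} \\
(A_3 \otimes 1)(\alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}) &= m_A\, \alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B} \\
(1 \otimes B_3)(\alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}) &= m_B\, \alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}.
\end{aligned}
\]

It also follows from the definition of the composite inner product that it is an ON-eigenbasis. That is,
\[
\begin{aligned}
\langle \alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}, \alpha^{m_A'}_{j_A'} \otimes \beta^{m_B'}_{j_B'} \rangle_{12} &:= \langle \alpha^{m_A}_{j_A}, \alpha^{m_A'}_{j_A'} \rangle_1\, \langle \beta^{m_B}_{j_B}, \beta^{m_B'}_{j_B'} \rangle_2 \\
&= \delta_{j_A j_A'}\, \delta_{m_A m_A'}\, \delta_{j_B j_B'}\, \delta_{m_B m_B'}.
\end{aligned}
\]

So we have an ON-eigenbasis for the eigenvectors of these operators. As we shall see, this basis shall be crucial to finding the eigenvectors of $J^2$ and $J_3$, and so their spectra.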


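15.3 Conversion to Eigenbasis in Terms of $J^2$, $J_3$, $A^2 \otimes 1$, $1 \otimes B^2$

The first thing we note here is that not only is $J^2$ a Casimir operator of $J_1, J_2, J_3$ but so are $A^2 \otimes 1$ and $1 \otimes B^2$. This is seen straight from the linearity of the commutator bracket,
\[
\begin{aligned}
[A^2 \otimes 1, J_i] &:= [A^2 \otimes 1, A_i \otimes 1 + 1 \otimes B_i] \\
&= [A^2 \otimes 1, A_i \otimes 1] + [A^2 \otimes 1, 1 \otimes B_i] \\
&= 0,
\end{aligned}
\]
as each bracket vanishes. Similarly for $1 \otimes B^2$. We also have, using Corollary 11.11, that
\[ [J^2, A^2 \otimes 1] = 0 = [J^2, 1 \otimes B^2]. \]

We can therefore consider eigenvectors of $J_3$ that are not only common to $J^2$ but also to $A^2 \otimes 1$ and $1 \otimes B^2$, and so we have a simultaneous eigenbasis, $\{\xi^m_{j, j_A, j_B}\}$, which satisfies
\[
\begin{aligned}
J^2\, \xi^m_{j, j_A, j_B} &= j(j+1)\, \xi^m_{j, j_A, j_B} \\
J_3\, \xi^m_{j, j_A, j_B} &= m\, \xi^m_{j, j_A, j_B} \\
(A^2 \otimes 1)\, \xi^m_{j, j_A, j_B} &= j_A(j_A + 1)\, \xi^m_{j, j_A, j_B} \\
(1 \otimes B^2)\, \xi^m_{j, j_A, j_B} &= j_B(j_B + 1)\, \xi^m_{j, j_A, j_B}.
\end{aligned}
\]

Now since we already have the ON-eigenbasis $\{\alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}\}$ for $A^2 \otimes 1$ and $1 \otimes B^2$ it follows (by the definition of a basis) that this new basis can be expanded as^{32}
\[ \xi^m_{j, j_A, j_B} = \sum_{m_A = -j_A}^{j_A} \sum_{m_B = -j_B}^{j_B} \langle \alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}, \xi^m_{j, j_A, j_B} \rangle\, \alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}. \]

Definition. We define the Clebsch-Gordan coefficients (CGc) as
\[ C^{m, m_A, m_B}_{j, j_A, j_B} := \langle \alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}, \xi^m_{j, j_A, j_B} \rangle, \]
and so we can rewrite the previous expression as
\[ \xi^m_{j, j_A, j_B} = \sum_{m_A = -j_A}^{j_A} \sum_{m_B = -j_B}^{j_B} C^{m, m_A, m_B}_{j, j_A, j_B}\, \alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}. \]

32 We shall drop the subscript on the inner product here to lighten notation, but obviously it is the one for the composite Hilbert space.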


Remark 15.4. The Clebsch-Gordan coefficients are just complex numbers. Although they might be rather difficult to calculate in practice, the method should now be clear. All we need to do is calculate the CGc and then we instantly have our new eigenbasis, and so we get the spectra for $J^2$ and $J_3$. Calculating them forms the remainder of this lecture.

The strategy is as follows: start from some convenient eigenvector $\xi^m_{j, j_A, j_B}$ and its associated CGcs, then use the ladder operators to obtain the eigenvector $\xi^{m \pm 1}_{j, j_A, j_B}$ and the resulting CGcs. We will then change the value of $j$ itself and repeat the process. In this manner we will build up a table of CGcs.

[Figure: schematic table of Clebsch-Gordan coefficients. Changing the $j$ value moves between columns; applying the ladder operators moves along a column.]


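15.4 Value of $m$

Consider the action of $J_3$ on both bases,
\[
\begin{aligned}
J_3\, \xi^m_{j, j_A, j_B} &= m\, \xi^m_{j, j_A, j_B}, \\
J_3(\alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}) &= (m_A + m_B)(\alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}).
\end{aligned}
\]
Then, from the fact that the CGcs are simply complex numbers and the fact that $J_3$ is linear, it follows from the expansion equation that we require
\[ m = m_A + m_B. \]
In other words, whenever $m \neq m_A + m_B$, we require that the CGc vanishes. We can, therefore, place this as a constraint on our summands, giving us
\[ \xi^m_{j, j_A, j_B} = \sum_{\substack{m_A, m_B \\ m_A + m_B = m}} C^{m, m_A, m_B}_{j, j_A, j_B}\, (\alpha^{m_A}_{j_A} \otimes \beta^{m_B}_{j_B}), \]
where we have left the ranges of $m_A$ and $m_B$ out, but they are of course just $-j_A, \dots, j_A$ and $-j_B, \dots, j_B$.
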


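15.5 Clebsch-Gordan Coefficients for Maximal $j$

We are now in a position to choose our convenient initial eigenvector. It follows from the ranges of $m_A$ and $m_B$ along with the condition $m = m_A + m_B$ and $m = -j, \dots, j$ that the maximum value $j$ can take is $j_A + j_B$. It follows then that
\[ \xi^{j_A + j_B}_{j_A + j_B, j_A, j_B} = C^{j_A + j_B, j_A, j_B}_{j_A + j_B, j_A, j_B}\, (\alpha^{j_A}_{j_A} \otimes \beta^{j_B}_{j_B}), \]
and all other CGcs at the level $C^{j_A + j_B, -, -}_{j_A + j_B, j_A, j_B}$ vanish. Then from the fact that both eigenbases are normalised we know that
\[ \big| C^{j_A + j_B, j_A, j_B}_{j_A + j_B, j_A, j_B} \big|^2 = 1, \]
and so the two eigenvectors vary only by a complex phase. However, seeing as we are only interested in eigenvalues here, and an overall phase has no effect on the eigenvalue, we are free to set this phase however we like. We choose it such that
\[ C^{j_A + j_B, j_A, j_B}_{j_A + j_B, j_A, j_B} = 1. \]

We can now start applying the ladder operators to lower the value of $m = j_A + j_B$. Using Proposition 13.24 we have
\[
\begin{aligned}
J_-\, \xi^{j_A + j_B}_{j_A + j_B, j_A, j_B} &= \sqrt{(j_A + j_B)(j_A + j_B + 1) - (j_A + j_B)(j_A + j_B - 1)}\ \xi^{j_A + j_B - 1}_{j_A + j_B, j_A, j_B} \\
&= \sqrt{2(j_A + j_B)}\ \xi^{j_A + j_B - 1}_{j_A + j_B, j_A, j_B}.
\end{aligned}
\]
However we equally have
\[
\begin{aligned}
J_-(\alpha^{j_A}_{j_A} \otimes \beta^{j_B}_{j_B}) &= (A_- \otimes 1 + 1 \otimes B_-)(\alpha^{j_A}_{j_A} \otimes \beta^{j_B}_{j_B}) \\
&= \sqrt{2j_A}\, (\alpha^{j_A - 1}_{j_A} \otimes \beta^{j_B}_{j_B}) + \sqrt{2j_B}\, (\alpha^{j_A}_{j_A} \otimes \beta^{j_B - 1}_{j_B}).
\end{aligned}
\]
Then, equating these two, we obtain
\[ C^{j_A + j_B - 1, j_A - 1, j_B}_{j_A + j_B, j_A, j_B} = \sqrt{\frac{j_A}{j_A + j_B}}, \qquad C^{j_A + j_B - 1, j_A, j_B - 1}_{j_A + j_B, j_A, j_B} = \sqrt{\frac{j_B}{j_A + j_B}}, \]
with all other CGcs at this level vanishing.

We can repeat this process to obtain the CGcs at the level $C^{j_A + j_B - 2, -, -}_{j_A + j_B, j_A, j_B}$ and iterate until we reach $m = -(j_A + j_B)$, which is where it must terminate.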


15.6 Clebsch-Gordan Coefficients For Lower Than Max $j$

We now wish to reduce $j$ itself to produce the second column of our table. We first need to ask what the next highest allowed $j$ value is. Recalling that $j \in \mathbb{N}_0/2$ we might try $j_A + j_B - 1/2$, however this is not allowed. The answer to why follows simply from the fact that $j_A$ and $j_B$ themselves are fixed, so all we can change is $m_A$ and $m_B$, which must change in integer steps. Combining this with $m = m_A + m_B$, which holds generally, we would not be able to get $m = -j, \dots, j$. That is, the next CGcs are of level $C^{j_A + j_B - 1, -, -}_{j_A + j_B - 1, j_A, j_B}$.

Then using the fact that there are only two ways to obtain this (either $m_A \to m_A - 1$ or $m_B \to m_B - 1$) we have
\[ \xi^{j_A + j_B - 1}_{j_A + j_B - 1, j_A, j_B} = C^{j_A + j_B - 1, j_A - 1, j_B}_{j_A + j_B - 1, j_A, j_B}\, (\alpha^{j_A - 1}_{j_A} \otimes \beta^{j_B}_{j_B}) + C^{j_A + j_B - 1, j_A, j_B - 1}_{j_A + j_B - 1, j_A, j_B}\, (\alpha^{j_A}_{j_A} \otimes \beta^{j_B - 1}_{j_B}). \]
We then use the fact that the eigenvectors in this equation are all orthonormal to obtain
\[ 1 = \big| C^{j_A + j_B - 1, j_A - 1, j_B}_{j_A + j_B - 1, j_A, j_B} \big|^2 + \big| C^{j_A + j_B - 1, j_A, j_B - 1}_{j_A + j_B - 1, j_A, j_B} \big|^2, \]
and we also use the fact that the RHS eigenvectors are the same here as with the $J_-$ case above, however the LHS eigenvectors are necessarily orthogonal, to give
\[
\begin{aligned}
0 &= \overline{C^{j_A + j_B - 1, j_A - 1, j_B}_{j_A + j_B, j_A, j_B}}\ C^{j_A + j_B - 1, j_A - 1, j_B}_{j_A + j_B - 1, j_A, j_B} + \overline{C^{j_A + j_B - 1, j_A, j_B - 1}_{j_A + j_B, j_A, j_B}}\ C^{j_A + j_B - 1, j_A, j_B - 1}_{j_A + j_B - 1, j_A, j_B} \\
\implies \sqrt{\frac{j_A}{j_A + j_B}}\ C^{j_A + j_B - 1, j_A - 1, j_B}_{j_A + j_B - 1, j_A, j_B} &= -\sqrt{\frac{j_B}{j_A + j_B}}\ C^{j_A + j_B - 1, j_A, j_B - 1}_{j_A + j_B - 1, j_A, j_B} \\
\implies C^{j_A + j_B - 1, j_A - 1, j_B}_{j_A + j_B - 1, j_A, j_B} &= -\sqrt{\frac{j_B}{j_A}}\ C^{j_A + j_B - 1, j_A, j_B - 1}_{j_A + j_B - 1, j_A, j_B}.
\end{aligned}
\]
Solving simultaneously,
\[ \big| C^{j_A + j_B - 1, j_A, j_B - 1}_{j_A + j_B - 1, j_A, j_B} \big| = \sqrt{\frac{j_A}{j_A + j_B}}, \qquad \big| C^{j_A + j_B - 1, j_A - 1, j_B}_{j_A + j_B - 1, j_A, j_B} \big| = \sqrt{\frac{j_B}{j_A + j_B}}. \]
Finally we just fix the phases as we want to give
\[ C^{j_A + j_B - 1, j_A, j_B - 1}_{j_A + j_B - 1, j_A, j_B} = \sqrt{\frac{j_A}{j_A + j_B}}, \qquad C^{j_A + j_B - 1, j_A - 1, j_B}_{j_A + j_B - 1, j_A, j_B} = -\sqrt{\frac{j_B}{j_A + j_B}}. \]

We can then apply the $J_-$ operator as before to move down this column. We can repeat this process of lowering $j$ again to obtain the third column, and iterate until we reach $j = |j_A - j_B|$, where it must terminate. We see that this is the termination point quickly from $m = -j, \dots, j$ along with $m = m_A + m_B$. On the next page I have included a table (from David J. Griffiths' QM book) for some calculated CGcs. As we can see... they're not pretty things.
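
The coefficients obtained so far can be sanity-checked numerically. The sketch below tabulates the four CGcs at $m = j_A + j_B - 1$ (two for $j = j_A + j_B$ and two for $j = j_A + j_B - 1$, with the phase choices made above) and confirms that the two resulting eigenvectors are normalised and mutually orthogonal. The function name and the dictionary layout, keyed by $(m_A, m_B)$, are illustrative assumptions.

```python
import math


def top_two_columns(j_a, j_b):
    """CGcs at m = j_a + j_b - 1 for j = j_a + j_b and for j = j_a + j_b - 1.

    Each dictionary maps (m_a, m_b) to the corresponding coefficient.
    """
    total = j_a + j_b
    a = math.sqrt(j_a / total)
    b = math.sqrt(j_b / total)
    max_j = {(j_a - 1, j_b): a, (j_a, j_b - 1): b}
    next_j = {(j_a - 1, j_b): -b, (j_a, j_b - 1): a}
    return max_j, next_j


max_j, next_j = top_two_columns(0.5, 0.5)

norm_max = sum(c * c for c in max_j.values())
norm_next = sum(c * c for c in next_j.values())
overlap = sum(max_j[key] * next_j[key] for key in max_j)
print(norm_max, norm_next, overlap)
```

For $j_A = j_B = 1/2$ these are the familiar $m = 0$ triplet and singlet combinations, with coefficients $\pm 1/\sqrt{2}$.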


15.7 Total Spin of Composite System 


We conclude, then, that in quantum mechanics when we want to compose a spin-$j_A$ system with a spin-$j_B$ system we do not just get a spin-$(j_A + j_B)$ system, but instead we get the direct sum
\[ (\text{spin-}(j_A + j_B)) \oplus (\text{spin-}(j_A + j_B - 1)) \oplus \dots \oplus (\text{spin-}|j_A - j_B|). \]
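
A quick consistency check on this decomposition is a dimension count: a spin-$j$ system has $2j + 1$ basis vectors, so the dimensions of the summands should add up to $\dim(\mathcal{H}_A \otimes \mathcal{H}_B) = (2j_A + 1)(2j_B + 1)$. The sketch below, with illustrative function names, verifies this for a range of spins using exact fractions.

```python
from fractions import Fraction


def allowed_j(j_a, j_b):
    """The values j = |j_a - j_b|, ..., j_a + j_b in integer steps."""
    j_a, j_b = Fraction(j_a), Fraction(j_b)
    low, high = abs(j_a - j_b), j_a + j_b
    return [low + k for k in range(int(high - low) + 1)]


def dims_match(j_a, j_b):
    """Compare (2 j_a + 1)(2 j_b + 1) with the sum of 2j + 1 over allowed j."""
    product_dim = (2 * Fraction(j_a) + 1) * (2 * Fraction(j_b) + 1)
    sum_dim = sum(2 * j + 1 for j in allowed_j(j_a, j_b))
    return product_dim == sum_dim


all_match = all(
    dims_match(Fraction(a, 2), Fraction(b, 2)) for a in range(7) for b in range(7)
)
print(allowed_j(Fraction(1, 2), 1), all_match)
```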


[Table: Clebsch-Gordan coefficients, reproduced from Griffiths' QM book. A square root sign is understood for every entry; the minus sign, if present, goes outside the radical.]



16 Quantum Harmonic Oscillator 


As has been remarked, the world and everything in it are quantum by nature. There is no 'classical' ball which we make quantum, there is a quantum ball that we approximate classically. Equally there isn't a 'classical' harmonic oscillator which we use to construct the quantum one: it is a mistake to start from the classical system and try to promote it to the quantum counterpart. We shall demonstrate explicitly in the first section why this is not a good idea, but a quick argument explains it.

Imagine you have some general theory. Of course you can obtain any special theory related to it by taking approximations/constraints, however you have no real hope of doing the opposite: you should not expect to be able to obtain the general theory by 'unapproximating' the special one. Quantum mechanics is the general theory, and classical mechanics is the special one. It is therefore a ridiculous idea to try and obtain quantum theory this way.


16.1 What People Say 


Despite the clear message above, people still choose to do such a thing; they take the 
equations governing a system classically and ‘replace them’ with the quantum versions. To 
be fair, it is not that the physics community are unaware of the above fact, it is simply that 
they argue ‘we only do it in special cases where we know no problems arise.’ 

However, even in said special circumstances, we argue, it is still a terribly misleading 
and potentially devastating (theoretically speaking!) idea. We shall quickly highlight why 
this is. 

The procedure is as follows. Take the function representing your classical observable $f(p, q)$, where $p$ is the momentum and $q$ the position, and simply rewrite the function but replacing $p$ with the quantum mechanical operator $P$ and $q$ with $Q$,
\[ f(p, q) \rightsquigarrow f(P, Q). \]
For example the energy observable for the harmonic oscillator
\[ h(p, q) = \frac{1}{2m} p^2 + \frac{m\omega^2}{2} q^2 \quad \rightsquigarrow \quad H := h(P, Q) = \frac{1}{2m} P \circ P + \frac{m\omega^2}{2} Q \circ Q. \]
It follows from
\[ P, Q : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R}), \]
with
\[ (P\psi)(x) = -i\hbar\, \psi'(x), \qquad (Q\psi)(x) = x\, \psi(x), \]
that
\[ H : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R}), \]
with
\[ (H\psi)(x) = -\frac{\hbar^2}{2m} \psi''(x) + \frac{m\omega^2}{2} x^2 \psi(x). \]
This often appears as
\[ H := -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + \frac{m\omega^2}{2} x^2, \]
or for a more general case (i.e. not a harmonic oscillator) as
\[ H := -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + V(x), \]
where $V(x)$ is the potential associated to the system.

This all looks very nice, and indeed it is correct, however there is a serious problem here. Classically we could add
\[ (pq - qp)\, g(p, q), \]
for some other observable of the system $g(p, q)$ without changing anything, as the bracket vanishes. That is
\[ f(p, q) = f(p, q) + (pq - qp)\, g(p, q). \]
However if we then applied the '$\rightsquigarrow$' approach to this we would get
\[ f(P, Q) = f(P, Q) + [P, Q]\, g(P, Q) = f(P, Q) - i\hbar\, g(P, Q), \]
which is obviously not true for general $g(P, Q)$. So it appears that even in these simple cases where 'there is no danger', there is a serious theoretical problem. For this reason we shall just not do this at all, and instead simply define what we mean by the energy observable (or Hamiltonian) of our system and proceed from there.
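
The ordering problem comes down to the fact that $PQ - QP$ does not vanish, unlike $pq - qp$. The following sketch makes this concrete. It is a simplification under stated assumptions: the operators act on polynomials (stored as coefficient lists) rather than on $\mathcal{S}(\mathbb{R})$, which is enough because the identity $(PQ - QP)\psi = -i\hbar\, \psi$ is purely algebraic, and units with $\hbar = 1$ are used. The names `Q`, `P` and `sub` are illustrative.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)


def Q(poly):
    """Multiply by x; poly[k] is the coefficient of x**k."""
    return [0] + list(poly)


def P(poly):
    """Apply -i*hbar*d/dx to a polynomial."""
    return [-1j * HBAR * k * c for k, c in enumerate(poly)][1:] or [0]


def sub(p1, p2):
    """Difference of two polynomials, padding the shorter one with zeros."""
    n = max(len(p1), len(p2))
    p1 = list(p1) + [0] * (n - len(p1))
    p2 = list(p2) + [0] * (n - len(p2))
    return [a - b for a, b in zip(p1, p2)]


psi = [1, 2, 0, 5]                       # 1 + 2x + 5x^3
pq_minus_qp = sub(P(Q(psi)), Q(P(psi)))  # (PQ - QP) psi

# Classically p*q - q*p = 0, but here the result is -i*hbar*psi.
matches = all(abs(a + 1j * HBAR * c) < 1e-12 for a, c in zip(pq_minus_qp, psi))
print(pq_minus_qp, matches)
```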


16.2 The Quantum Harmonic Oscillator 


In keeping with Axiom 1, we need an underlying Hilbert space; we use $\mathcal{H} = L^2(\mathbb{R})$. We also have (in agreement with Axiom 4) an energy observable, known as the Hamiltonian of the system,
\[ H := \frac{1}{2m} P \circ P + \frac{m\omega^2}{2} Q \circ Q. \]

However, a note must be made. If $H$ is to be an observable, it must be self adjoint. But in the above expression we have used the essentially self adjoint $Q, P : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$. This is not a large worry as we can simply take their unique self adjoint extensions. We still have a problem though. Although $P \circ P$ and $Q \circ Q$ (as the self adjoint extensions) will be self adjoint, their sum need not be, as the adjoint does not necessarily distribute across the addition.

What we shall do is consider the essentially self adjoint operators throughout, and then at the end we shall present a theorem that allows us to conclude that $H$ (constructed from the essentially self adjoint operators) is essentially self adjoint, and so a unique self adjoint extension exists.

As above, we shall not employ a different notation for the self adjoint and essentially self adjoint operators, but instead infer which we are dealing with by considering the domains.



16.3 The Energy Spectrum 


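Recall that the spectrum of an operator is given by
\[ \sigma(H) = \sigma_p(H) \cup \sigma_c(H). \]
The aim of this lecture is to calculate $\sigma_p(H)$ and show that $\sigma_c(H) = \emptyset$.

Definition. Consider the operators $Q, P : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$. Then define $a_\pm : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$ via
\[ a_\pm := \sqrt{\frac{m\omega}{2\hbar}}\, Q \mp \frac{i}{\sqrt{2\hbar m\omega}}\, P. \]

Corollary 16.1. We can re-express the Hamiltonian as
\[ H = \hbar\omega \Big( a_+ a_- + \frac{1}{2} \mathrm{id}_{\mathcal{S}(\mathbb{R})} \Big). \]

Proof. The proof follows from direct substitution. Let
\[ \alpha := \sqrt{\frac{m\omega}{2\hbar}}, \qquad \beta := \frac{1}{\sqrt{2\hbar m\omega}}. \]
Then
\[
\begin{aligned}
\hbar\omega \Big( a_+ a_- + \frac{1}{2} \mathrm{id}_{\mathcal{S}(\mathbb{R})} \Big) &= \hbar\omega \Big( (\alpha Q - i\beta P)(\alpha Q + i\beta P) + \frac{1}{2} \mathrm{id}_{\mathcal{S}(\mathbb{R})} \Big) \\
&= \hbar\omega \Big( \alpha^2\, Q \circ Q + \beta^2\, P \circ P + i\alpha\beta (QP - PQ) + \frac{1}{2} \mathrm{id}_{\mathcal{S}(\mathbb{R})} \Big) \\
&= \hbar\omega \Big( \alpha^2\, Q \circ Q + \beta^2\, P \circ P + i\alpha\beta [Q, P] + \frac{1}{2} \mathrm{id}_{\mathcal{S}(\mathbb{R})} \Big) \\
&= \hbar\omega \Big( \frac{m\omega}{2\hbar}\, Q \circ Q + \frac{1}{2\hbar m\omega}\, P \circ P + \frac{i}{2\hbar} (i\hbar)\, \mathrm{id}_{\mathcal{S}(\mathbb{R})} + \frac{1}{2} \mathrm{id}_{\mathcal{S}(\mathbb{R})} \Big) \\
&= \frac{m\omega^2}{2}\, Q \circ Q + \frac{1}{2m}\, P \circ P \\
&= H,
\end{aligned}
\]
where we have used $[Q, P] = i\hbar\, \mathrm{id}_{\mathcal{S}(\mathbb{R})}$. $\square$
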
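Proposition 16.2. The following commutation relations hold:

(i) $[a_-, a_+] = \mathrm{id}_{\mathcal{S}(\mathbb{R})}$,
(ii) $[H, a_+] = \hbar\omega\, a_+$,
(iii) $[H, a_-] = -\hbar\omega\, a_-$.

Proof. They all follow from direct substitution, using $H$ as written in the previous Corollary.

(i)
\[
\begin{aligned}
[a_-, a_+] &= [\alpha Q + i\beta P, \alpha Q - i\beta P] \\
&= \alpha^2 [Q, Q] - i\alpha\beta [Q, P] + i\beta\alpha [P, Q] + \beta^2 [P, P] \\
&= -2i\alpha\beta\, [Q, P] \\
&= -\frac{2i}{2\hbar} (i\hbar)\, \mathrm{id}_{\mathcal{S}(\mathbb{R})} \\
&= \mathrm{id}_{\mathcal{S}(\mathbb{R})},
\end{aligned}
\]
where we have made use of the linearity of the commutator bracket.

(ii)
\[
\begin{aligned}
[H, a_+] &= \Big[ \hbar\omega\, a_+ a_- + \frac{\hbar\omega}{2} \mathrm{id}_{\mathcal{S}(\mathbb{R})}, a_+ \Big] \\
&= \hbar\omega\, [a_+ a_-, a_+] + \frac{\hbar\omega}{2} [\mathrm{id}_{\mathcal{S}(\mathbb{R})}, a_+] \\
&= \hbar\omega \big( a_+ [a_-, a_+] + [a_+, a_+]\, a_- \big) \\
&= \hbar\omega\, a_+\, \mathrm{id}_{\mathcal{S}(\mathbb{R})} \\
&= \hbar\omega\, a_+.
\end{aligned}
\]

(iii) This follows exactly analogously to (ii).

$\square$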


Remark 16.3. Strictly speaking in the previous proof we should have considered the action 


of the commutator on an element of S(R) and showed that the expressions hold for an 


arbitrary element. Doing it this way will return the same results, however this will not 


always be true, and so care must be taken in future. 


There are four more basic facts that allow us to obtain the spectrum in its entirety. 


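We claim that, for an $H$-eigenvector $\psi$ with eigenvalue $E$, the following hold:

(i) $H(a_+\psi) = (E + \hbar\omega)(a_+\psi)$,
(ii) $\|a_+\psi\| \geq \|\psi\| > 0$,
(iii) $H(a_-\psi) = (E - \hbar\omega)(a_-\psi)$,
(iv) $E \geq \frac{1}{2}\hbar\omega$.

Proof. (i) From the previous result we have
\[
\begin{aligned}
H(a_+\psi) &= a_+(H\psi) + [H, a_+]\psi \\
&= E\, a_+\psi + \hbar\omega\, a_+\psi \\
&= (E + \hbar\omega)(a_+\psi).
\end{aligned}
\]

(ii) Given that $(a_+)^* = a_-$ and vice versa,^{33}
\[
\begin{aligned}
\|a_+\psi\|^2 &= \langle a_+\psi | a_+\psi \rangle \\
&= \langle \psi | (a_+)^* a_+ \psi \rangle \\
&= \langle \psi | a_- a_+ \psi \rangle \\
&= \langle \psi | a_+ a_- \psi \rangle + \langle \psi | [a_-, a_+] \psi \rangle \\
&= \langle a_-\psi | a_-\psi \rangle + \langle \psi | \mathrm{id}_{\mathcal{S}(\mathbb{R})} \psi \rangle \\
&\geq \langle \psi | \psi \rangle \\
&= \|\psi\|^2 \\
&> 0,
\end{aligned}
\]
where we used the fact that the inner product is non-negative definite in the third to last step (i.e. the first term is non-negative), and that an eigenvector is non-zero in the last step. The result follows from taking the square root and imposing the condition that the norm is non-negative.

(iii) This is done exactly analogously to (i).

(iv) Consider
\[
\begin{aligned}
E \langle \psi | \psi \rangle &= \langle \psi | E\psi \rangle \\
&= \langle \psi | H\psi \rangle \\
&= \hbar\omega \Big( \langle \psi | a_+ a_- \psi \rangle + \frac{1}{2} \langle \psi | \mathrm{id}_{\mathcal{S}(\mathbb{R})} \psi \rangle \Big) \\
&= \hbar\omega \Big( \langle a_-\psi | a_-\psi \rangle + \frac{1}{2} \langle \psi | \psi \rangle \Big) \\
&\geq \frac{\hbar\omega}{2} \langle \psi | \psi \rangle.
\end{aligned}
\]
Then from the fact that $\psi$ is an eigenvector (and so cannot be the zero vector), the inner product is non-vanishing and we can divide through by it, giving the result.

$\square$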


We can, thus, draw some conclusions. For any $H$-eigenvector $\psi$ with eigenvalue $E$ we have:

1. From (i) and (ii) it follows that $a_+\psi$ is an eigenvector, as (i) tells us it obeys the eigenvalue equation and (ii) tells us it is not the zero vector. Thus we know that the sequence
\[
\{(a_+)^n\psi\}_{n\in\mathbb{N}},
\]
where the power indicates $n$-th order composition of operators, is a sequence of eigenvectors with corresponding eigenvalues
\[
\{E + n\hbar\omega\}_{n\in\mathbb{N}}.
\]

³³ To show this you need to consider the definition of the adjoint and work from there, as you don't know that it will distribute across the addition in the definitions.


2. (iii) and (iv) tell us that the sequence of eigenvectors
\[
\{(a_-)^n\psi\}_{n\in\mathbb{N}_0}
\]
must terminate for some $n = N \in \mathbb{N}$. That is, we cannot keep lowering the eigenvalue $E$ forever, as (iv) says it is bounded from below. Note this tells us that $a_-\psi$ is not always an $H$-eigenvector (just as was the case with $J_\pm$ for Q and $J_3$).

In other words there is a non-vanishing $\psi_0 \in \mathcal{S}(\mathbb{R})$ defined as
\[
\psi_0 := (a_-)^N\psi
\]
such that $a_-\psi_0 = 0_{\mathcal{S}(\mathbb{R})}$. It follows, then, from the definition of the Hamiltonian that
\[
H\psi_0 = \hbar\omega\, a_+a_-\psi_0 + \frac{\hbar\omega}{2}\psi_0 = \frac{\hbar\omega}{2}\psi_0,
\]
and so it has the lowest possible eigenvalue, by (iv).


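3. The entire sequence (as defined above) of eigenvalues is
\[
\Big\{\hbar\omega\Big(n + \frac{1}{2}\Big)\Big\}_{n\in\mathbb{N}_0}.
\]
Equivalently, we say the $n$-th eigenvector
\[
\psi_n := (a_+)^n\psi_0
\]
has the corresponding eigenvalue
\[
E_n = \hbar\omega\Big(n + \frac{1}{2}\Big).
\]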


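4. Considering again $a_-\psi_0 = 0_{\mathcal{S}(\mathbb{R})}$ along with the definition of $a_-$ we have
\[
\left(\sqrt{\frac{m\omega}{2\hbar}}\,x + i\,\frac{1}{\sqrt{2m\omega\hbar}}\,(-i\hbar)\frac{d}{dx}\right)\psi_0(x) = 0,
\]
which is just an ODE. Rearranging, we have
\[
\psi_0'(x) = -\frac{m\omega}{\hbar}\,x\,\psi_0(x),
\]
which using the standard separation of variables technique gives
\[
\begin{aligned}
\ln|\psi_0(x)| &= -\frac{m\omega}{2\hbar}x^2 + C, \\
\psi_0(x) &= \pm e^{C} e^{-\frac{m\omega}{2\hbar}x^2}, \\
\psi_0(x) &= A\,e^{-\frac{m\omega}{2\hbar}x^2},
\end{aligned}
\]
for constants $C$ and $A := \pm e^{C}$.

Imposing a normalisation condition, we can then write the $n$-th eigenvector in terms of the $n$-th Hermite polynomial, $H_n$, as
\[
\psi_n(x) = \frac{1}{\sqrt{2^n n!}}\left(\frac{m\omega}{\pi\hbar}\right)^{1/4} H_n\!\left(\sqrt{\frac{m\omega}{\hbar}}\,x\right) e^{-\frac{m\omega}{2\hbar}x^2}.
\]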
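As a quick numerical sanity check of the ground state found in item 4, the following sketch applies $H$ to the unnormalised $\psi_0$ and compares the result with $\frac{\hbar\omega}{2}\psi_0$. It assumes the position representation $H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{m\omega^2}{2}x^2$, units $m = \omega = \hbar = 1$, and a central finite difference for the second derivative; the function names are illustrative only and do not appear in the lectures.

```python
import math

def psi0(x, m=1.0, omega=1.0, hbar=1.0):
    """Unnormalised ground state exp(-m*omega*x^2 / (2*hbar)), i.e. A = 1."""
    return math.exp(-m * omega * x * x / (2.0 * hbar))

def apply_H(psi, x, h=1e-3, m=1.0, omega=1.0, hbar=1.0):
    """Approximate (H psi)(x) for the assumed position-space Hamiltonian.

    The second derivative is replaced by a central finite difference,
    so the result is only accurate up to O(h^2).
    """
    second_derivative = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / (h * h)
    kinetic = -(hbar ** 2) / (2.0 * m) * second_derivative
    potential = 0.5 * m * omega ** 2 * x * x * psi(x)
    return kinetic + potential

def local_energy(x):
    """Ratio (H psi0)(x) / psi0(x); for an eigenvector this is independent of x."""
    return apply_H(psi0, x) / psi0(x)

if __name__ == "__main__":
    for x in (0.0, 0.7, 1.5):
        print(x, local_energy(x))
```

In these units the printed ratio should be close to $1/2$ at every sample point, which is the statement $H\psi_0 = \frac{\hbar\omega}{2}\psi_0$.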


Corollary 16.4. From 4. we note that (up to the usual ambiguity of a complex multiple) there is only one eigenvector to each eigenvalue. That is, we have the 1-dimensional eigenspace
\[
\mathrm{Eig}_H(E_n) = \mathrm{span}_{\mathbb{C}}(\psi_n),
\]
which tells us not only that $\psi_0$ exists in the first place, but that it is unique.

Remark 16.5. At the end of the last corollary we said that we confirmed the existence of $\psi_0$ in the first place. This might seem like a strange comment given the whole calculation, however it is actually rather important. To illustrate why, Dr. Schuller mentions a doctoral proposal he once saw in which the student had derived some truly impressive formulae, only to have someone point out that towards the start of his calculation he had 0, and so everything that followed could have just been a repercussion of that (i.e. $0\cdot n = 0$ for any $n$ in your space). It is therefore important to check that the things you are using actually exist; in this case $\psi_0$ doesn't vanish and so is an eigenvector.


The Hermite polynomial expression is an equally important result, as it tells us that $\psi_n \in \mathcal{S}(\mathbb{R})$ (as the product of any polynomial with the Gaussian above is in $\mathcal{S}(\mathbb{R})$), which it needs to be if we are to act on it with our operators. Moreover, one can show that the set
\[
\{\psi_n \mid n \in \mathbb{N}_0\}
\]
is an ON-eigenbasis for $L^2(\mathbb{R})$, which leads us to the theorem promised at the start of the lecture.


Theorem 16.6. If a symmetric operator has as its eigenvectors an ON-basis, the operator 
is guaranteed to be essentially self adjoint. 


This theorem tells us that H is essentially self adjoint, and the fact that we have an 
ON-eigenbasis for L?(R) tells us that the continuous spectrum is empty. 




17 Measurements of Observables 


So far we have discussed the spectrum of an observable, which tells us all the possible measurement outcomes, but tells us nothing about the actual act of taking a measurement itself. This comes through axioms 3 and 5. In order to illustrate these two axioms we will repeatedly use the quantum harmonic oscillator as an example, but it is important to note that the methods are not restricted to this case. Any restrictions required for the methods to hold will be clearly stated.

This lecture can be read in two ways. One could read sections 4 and 5 first (on how you 
prepare a given state) and then return to read sections 1-3 (on how you take measurements 
of this state); or one could simply read it as presented (i.e. 1-5). Both reading orders 
have their advantages, but we present it here in the order taught by Dr. Schuller. Also 
in correspondence with the lecture given, we shall also translate some of the notation into 
the commonly used bra-ket notation (see lecture 4), even though we do not use it in this 
course. All these expressions shall appear in blue. 


17.1 Spectral Decomposition of H For The Quantum Harmonic Oscillator 


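Recall: We found an ON-basis of $H$-eigenvectors, which we labelled $\psi_n$, obeying
\[
H\psi_n = E_n\psi_n,
\]
with $n \in \mathbb{N}_0$. More precisely we derived
\[
E_n = \hbar\omega\Big(n + \frac{1}{2}\Big)
\]
and
\[
\psi_n(x) = \frac{1}{\sqrt{2^n n!}}\left(\frac{m\omega}{\pi\hbar}\right)^{1/4} H_n\!\left(\sqrt{\frac{m\omega}{\hbar}}\,x\right) e^{-\frac{m\omega}{2\hbar}x^2}.
\]
The only thing we will actually use in this lecture is the fact that $\{\psi_n\}$ is an ON-basis,
\[
\langle\psi_n|\psi_m\rangle = \delta_{nm},
\]
and the fact that the spectrum is given by
\[
\sigma(H) = \Big\{\hbar\omega\Big(n + \frac{1}{2}\Big) \,\Big|\, n \in \mathbb{N}_0\Big\}.
\]
The key to understanding measurement theory in quantum mechanics is that you know the spectral decomposition of the observable(s) you want to measure. In order to obtain the spectral decomposition of $H$ we consider the projectors
\[
P_n(\psi) := \langle\psi_n|\psi\rangle\,\psi_n, \qquad \text{i.e.} \quad P_n = |\psi_n\rangle\langle\psi_n|.
\]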


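Note that this operator is bounded as
\[
\sup_{\psi\in\mathcal{H}}\frac{\|P_n\psi\|^2}{\|\psi\|^2}
= \sup_{\psi\in\mathcal{H}}\frac{|\langle\psi_n|\psi\rangle|^2\,\|\psi_n\|^2}{\|\psi\|^2}
= \sup_{\psi\in\mathcal{H}}\frac{|\langle\psi_n|\psi\rangle|^2}{\|\psi\|^2}
\leq 1,
\]
where the supremum is over non-zero $\psi$ and the final bound is the Cauchy-Schwarz inequality with $\|\psi_n\| = 1$. We can, therefore, employ the operator norm on $\mathcal{L}(\mathcal{H})$ to decide convergence of the following sum, with the result
\[
\sum_{n=0}^{\infty} P_n = \mathrm{id}_{\mathcal{H}}.
\]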


Definition. For every Borel set $\Omega \subseteq \mathbb{R}$, define
\[
P_H(\Omega) := \sum_{E_n\in\Omega} P_n,
\]
i.e. the sum over the projectors such that the energy eigenvalue corresponding to the state³⁴ corresponding to $\psi_n$ is within your Borel set.


Remark 17.1. From now on we shall drop the $E_n \in \Omega$ on the sum, to lighten the notation, but it is important to remember that it belongs there whenever we use $P_H$. It will prove highly instrumental to the results that follow.


Example 17.2. Let $\Omega = \{E_m\}$, i.e. just the set containing the single eigenvalue $E_m$. Clearly then
\[
P_H(\Omega) = P_m.
\]


Example 17.3. Let $\Omega = \{E_m, E_k\}$. Then we have
\[
P_H(\Omega) = P_m + P_k.
\]


Proposition 17.4. The map $P_H : \sigma(\mathbb{R}) \to \mathcal{L}(\mathcal{H})$ is a projection valued measure, and in fact corresponds to the projection valued measure that appears in the spectral theorem for $H$. That is,
\[
H = \int_{\mathbb{R}} \lambda\, P_H(d\lambda) = \sum_{n=0}^{\infty} E_n P_n = \sum_{n=0}^{\infty} E_n\, |\psi_n\rangle\langle\psi_n|.
\]

Remark 17.5. The above proposition makes sense. The Hamiltonian (the energy operator) is given by the energy eigenvalues multiplied by a projector that projects the state into a state whose energy eigenvalue was the prefactor. This is clearly just the eigenvector equation.


³⁴ Again recall that the $\psi_n$ are not the states themselves; the $\rho_{\psi_n}$ are.


Remark 17.6. Note there was nothing specifically special about the fact that we were con- 
sidering the Hamiltonian above. Indeed the same method holds for any observable you 
wish to measure. First find an ON-basis of eigenvectors for your operator, A say, and then 
define the PVM associated to the observable 


$P_A : \sigma(\mathbb{R}) \to \mathcal{L}(\mathcal{H})$


in the same way and then plug it into the spectral theorem. 


17.2 Probability to Measure a Certain Value 


As stated in the opening of this lecture, the method is the same for all observables and 
their measurements, but we shall use the energy of the quantum harmonic oscillator as our 
working example. 

Recall, the spectrum of $H$ is the set of principally possible measurement outcomes. We want to show this pictorially, as it will help with understanding what's to follow.

As when measuring anything in the real world, we need some kind of measuring device. You feed in the thing you want to take a measurement of and the meter on the device tells you the measurement value. The one subtle, but highly important, difference to note with quantum mechanics is that the device potentially alters the thing you were measuring. We shall draw this as follows.

[Figure: the state $\rho_b$ enters a measuring device labelled $H$, which carries a pointer on a scale; the state $\rho_a$ leaves the device.]

The $H$ tells us that it is the device associated to the observable $H$, the scale markings tell us the spectrum³⁵, the arrow tells us the actual measurement made, $\rho_b$ is the state before the measurement and $\rho_a$ is the state after the measurement.


Remark 17.7. Note that the pointer here will not move continuously between the notches; it moves between the values by jumping between them. In other words, it cannot point in between two notches, as this would not be part of the spectrum.


Recall: a state $\rho$ of a quantum system is a self adjoint operator that is:

(i) Trace-class: $\mathrm{Tr}(\rho) = \sum_n \langle e_n|\rho(e_n)\rangle < \infty$, where $\{e_n\}$ is any ON-basis.
(ii) Unit trace: $\mathrm{Tr}(\rho) = 1$.
(iii) Positive: $\forall\psi\in\mathcal{H}$, $\langle\psi|\rho(\psi)\rangle \geq 0$.³⁶

³⁵ Here we are considering the spectrum of the harmonic oscillator, and so the notches are evenly spaced. Clearly this will not always be true. In general we have notches of varying separation as well as 'blocks' for the continuous parts of the spectrum.

Also recall: Axiom 3 asserts that the probability to obtain a measurement outcome (a 'pointer position') within a Borel set $\Omega$ when measuring an observable $H$ on a system in state $\rho$ is given by
\[
\mathrm{Tr}\big(P_H(\Omega)\circ\rho\big),
\]
where $P_H$ is the unique PVM from the spectral decomposition of $H$. In terms of our picture it asks the question 'What is the probability that the pointer points within the range $\Omega$ on the scale?'


Remark 17.8. We wish to emphasise this point again here. The spectrum of an observable only tells you the possible measurements and the results of last lecture give you information on the probability of each possible measurement. This is where the probabilistic nature enters into quantum mechanics. When a measurement is made, the result is concrete. You get exactly that result. This in turn affects the state of your system, giving a (potentially) new state. This is where the indeterminate nature of quantum mechanics enters.

That is, prior to the measurement you can only say with what probability you get one of the possible final states, but once the measurement is made, it is exactly that one, and which final state you get depends on which measurement result you get. This is the quantum behaviour of the system.

As we shall see in section 5 there is another form of probability concerned with quantum mechanics, but this probability does not stem from the quantum nature of the system itself. It stems from the 'ignorance' of the experimenter or the equipment, in the sense of being unable to distinguish which measurement was made. This results in what are known as mixed states.


17.3 Measurements of Pure States 


One can think of pure states as the most precise information one can obtain about a quantum system. Recall that any pure state can be written as
\[
\rho_\psi := \frac{\langle\psi|\cdot\rangle}{\langle\psi|\psi\rangle}\,\psi = \frac{|\psi\rangle\langle\psi|}{\langle\psi|\psi\rangle}
\]
for some $\psi \in \mathcal{H}$.

³⁶ Note that this should really be called 'non-negative', however this is just how it is named.


Remark 17.9. Again we emphasise that people often refer to $\psi$ as being the pure state itself. This might still seem forgivable, but as mentioned previously this is in fact uncountably infinitely incorrect, as we have a complex scaling ambiguity: for any $\lambda \in \mathbb{C}\setminus\{0\}$,
\[
\rho_{\lambda\psi} = \rho_\psi.
\]
One might then say 'Ok, just take the normalised $\psi$ elements,' but again this is still incorrect, as multiplying by $e^{i\alpha}$ for $\alpha \in \mathbb{R}$ would still give the same result. One could say, then, 'a state of the system is given by an element of the Hilbert space, up to arbitrary rescaling.'


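Now let's employ $\mathrm{Tr}(P_H(\Omega)\circ\rho_\psi)$ to calculate the probability to measure a certain energy of the harmonic oscillator for the state $\rho_\psi$. As we are dealing with the harmonic oscillator, which has a purely discrete spectrum, we can simply make our Borel sets such that they contain only one measurement (one notch on the scale). We have then, for some ON-basis $\{e_n\}$,
\[
\mathrm{Tr}\big(P_H(\{E_k\})\circ\rho_\psi\big) = \sum_n \big\langle e_n \,\big|\, \big(P_H(\{E_k\})\circ\rho_\psi\big)\, e_n\big\rangle.
\]
Now seeing as $\{e_n\}$ need only be some ON-basis, we are free to use the ON-eigenbasis $\{\psi_n\}$, giving³⁷
\[
\begin{aligned}
\mathrm{Tr}\big(P_H(\{E_k\})\circ\rho_\psi\big)
&= \sum_n \big\langle\psi_n \,\big|\, \big(P_H(\{E_k\})\circ\rho_\psi\big)\,\psi_n\big\rangle \\
&= \sum_n \big\langle\psi_n \,\big|\, \langle\psi_k|\rho_\psi(\psi_n)\rangle\,\psi_k\big\rangle \\
&= \sum_n \langle\psi_k|\rho_\psi(\psi_n)\rangle\,\langle\psi_n|\psi_k\rangle \\
&= \langle\psi_k|\rho_\psi(\psi_k)\rangle \\
&= \Big\langle\psi_k \,\Big|\, \frac{\langle\psi|\psi_k\rangle}{\|\psi\|^2}\,\psi\Big\rangle \\
&= \frac{\langle\psi|\psi_k\rangle\langle\psi_k|\psi\rangle}{\|\psi\|^2} \\
&= \frac{\|\langle\psi_k|\psi\rangle\,\psi_k\|^2}{\|\psi\|^2} \\
&= \frac{|\langle\psi_k|\psi\rangle|^2}{\|\psi\|^2},
\end{aligned}
\]
where we have used the fact that $\|\psi_k\| = 1$ to get to the last line.

We now note that although we do not require $\psi$ to be an eigenvector of $H$, we can always express it as a linear combination of the ON-eigenbasis $\{\psi_n\}$. The following two examples shall highlight this point and demonstrate how one could almost instantly determine the probabilities of obtaining a given energy measurement given the expression for $\psi$ corresponding to a pure state.

³⁷ Recall that in the definition of $P_H$ the sum is taken such that the energy eigenvalue with that index is within the Borel set.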


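Example 17.10. First imagine that $\psi$ is an eigenvector of $H$; then we clearly have
\[
\psi = A\psi_\ell
\]
for non-zero $A \in \mathbb{C}$ and some fixed $\ell$. Plugging this into the formula we obtain
\[
\begin{aligned}
\mathrm{Tr}\big(P_H(\{E_k\})\circ\rho_\psi\big)
&= \frac{\|\langle\psi_k|A\psi_\ell\rangle\,\psi_k\|^2}{\|A\psi_\ell\|^2} \\
&= \frac{|A|^2\,|\langle\psi_k|\psi_\ell\rangle|^2\,\|\psi_k\|^2}{|A|^2\,\|\psi_\ell\|^2} \\
&= \delta_{k\ell},
\end{aligned}
\]
so the probability of obtaining energy measurement $E_k$ vanishes unless $\psi_\ell = \psi_k$, in which case we are certain to get that measurement.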


Remark 17.11. In the above we could call $\psi$ an energy-eigenstate. In this case the use of the
word ‘state’ is truly forgivable as all the other eigenvectors with the same eigenvalue would 
produce the same state (as in both cases we have the same scaling ambiguity). However, if 
we wish to avoid confusing ourselves, we need not say this. 


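Example 17.12. Now let's assume $\psi$ isn't an $H$-eigenvector, but is a linear combination, say
\[
\psi = A\psi_p + B\psi_q,
\]
for $A, B \in \mathbb{C}$ and $p \neq q$. Then we have
\[
\begin{aligned}
\mathrm{Tr}\big(P_H(\{E_k\})\circ\rho_\psi\big)
&= \frac{\|\langle\psi_k|A\psi_p + B\psi_q\rangle\,\psi_k\|^2}{\|A\psi_p + B\psi_q\|^2} \\
&= \frac{|A\delta_{kp} + B\delta_{kq}|^2}{|A|^2 + |B|^2} \\
&= \frac{|A|^2\delta_{kp} + |B|^2\delta_{kq}}{|A|^2 + |B|^2} \\
&= \frac{|A|^2}{|A|^2 + |B|^2}\,\delta_{kp} + \frac{|B|^2}{|A|^2 + |B|^2}\,\delta_{kq},
\end{aligned}
\]
where we have used the fact that $\langle\psi_p|\psi_q\rangle = 0$ in the denominator and then used the fact that $\delta_{kp}\delta_{kq} = 0$ as $p \neq q$.

So we see that the coefficients $A$ and $B$ tell us the measurement probabilities. The extension to linear combinations with more elements follows trivially to give: if
\[
\psi = \sum_i c_i\psi_i
\]
for $c_i \in \mathbb{C}$, then the probability to measure energy $E_k$ is
\[
\mathrm{Tr}\big(P_H(\{E_k\})\circ\rho_\psi\big) = \frac{\sum_i |c_i|^2\delta_{ki}}{\sum_j |c_j|^2} = \frac{|c_k|^2}{\sum_j |c_j|^2}.
\]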
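The closing formula of this section is easy to turn into a small calculation. The sketch below takes the expansion coefficients $c_i$ of $\psi$ in the ON-eigenbasis (a finite list, which is an assumption made purely for illustration) and returns the probabilities $|c_k|^2 / \sum_j |c_j|^2$; the function name is hypothetical.

```python
def energy_probabilities(coeffs):
    """Return the list of probabilities |c_k|^2 / sum_j |c_j|^2.

    coeffs[k] is the (possibly complex) coefficient of psi_k in the
    expansion of psi; psi need not be normalised, but must be non-zero.
    """
    norm_sq = sum(abs(c) ** 2 for c in coeffs)
    if norm_sq == 0:
        raise ValueError("the zero vector does not define a state")
    return [abs(c) ** 2 / norm_sq for c in coeffs]

if __name__ == "__main__":
    # psi = 3 psi_0 + 4i psi_1, as in Example 17.12 with A = 3, B = 4i
    print(energy_probabilities([3, 4j]))
```

Note that rescaling all coefficients by a common non-zero complex number leaves the output unchanged, in line with Remark 17.9.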


17.4 Preparation of Pure States 


Axiom 5 asserts that upon measurement of $H$ the state $\rho_b$ is projected to the state $\rho_a$, given by³⁸
\[
\rho_a := \frac{P_H(\Omega)\,\rho_b\,P_H(\Omega)}{\mathrm{Tr}\big(P_H(\Omega)\,\rho_b\,P_H(\Omega)\big)},
\]
if $\Omega$ is the Borel set in which the actual, really observed, really having happened, measurement (pointer reading) came to lie. That is, it depends on the measurement result (past tense!).

This fact, however, can be used to our advantage in order to prepare a state of our 
choosing. We shall consider the following two cases separately: 


(i) The observable H with a discrete, non-degenerate spectrum, 


(ii) Allowing for degeneracy of the spectrum. 
For (i) we can prepare a pure state $\rho_{\psi_k}$, where $\psi_k$ is an eigenvector of $H$, by the following device.

[Figure: the state $\rho_b$ enters the measuring device $H$; the measurement output is sent to a filter set to $k$, the state $\rho_a$ is fed into the same filter, and $\rho_{\psi_k}$ leaves it.]

We feed a general pure state of our system into the $H$ device, which measures the energy of that pure state. It then sends this measurement output into the filter device. The state post measurement is also fed into the filter device, which is designed to only let something pass through it if the measurement was $E_k$. In this way the only pure state that can leave is $\rho_{\psi_k}$.

Note, although the final output is guaranteed to be $\rho_{\psi_k}$, that does not mean that the output from the $H$ device is always $\rho_{\psi_k}$: it could be any of the possible output states. It is also important that $H$ is non-degenerate, otherwise the filter would let multiple different states through, all of which gave $E_k$ as their measurement output.

³⁸ This is known as 'wave-function collapse' in the literature. Dr. Schuller does not like using the wave analogy and so, if anything, he called this 'collapse of the state'.
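The measure-then-filter device can be mimicked on a finite-dimensional toy model. The sketch below assumes the state is given as a density matrix written in the ON-eigenbasis (a nested list) and that the Borel set $\Omega$ is represented by the set of indices $n$ with $E_n \in \Omega$; both representations, and the function name, are illustrative choices rather than anything fixed by the lectures. It returns the post-measurement state of Axiom 5 together with the Axiom 3 probability $\mathrm{Tr}(P_H(\Omega)\,\rho_b\,P_H(\Omega)) = \mathrm{Tr}(P_H(\Omega)\circ\rho_b)$.

```python
def collapse(rho, omega):
    """Apply rho -> P rho P / Tr(P rho P) for P = sum of P_n with n in omega.

    rho   : square nested list, the density matrix in the eigenbasis of H
    omega : set of indices n whose eigenvalue E_n lies in the Borel set
    Returns (post-measurement state, probability of that outcome).
    """
    n = len(rho)
    # In the eigenbasis, P rho P simply zeroes every row/column outside omega.
    projected = [
        [rho[i][j] if (i in omega and j in omega) else 0 for j in range(n)]
        for i in range(n)
    ]
    probability = sum(projected[i][i] for i in range(n))
    if probability == 0:
        raise ValueError("this outcome has probability zero")
    post = [[entry / probability for entry in row] for row in projected]
    return post, probability

if __name__ == "__main__":
    # rho_psi for psi = psi_0 + psi_1 (normalisation handled by rho_psi itself)
    rho_b = [[0.5, 0.5], [0.5, 0.5]]
    rho_a, prob = collapse(rho_b, {0})  # the filter only lets k = 0 through
    print(rho_a, prob)
```

With a single index in `omega` and a non-degenerate spectrum, the returned state is always the projector onto that eigenvector, which is exactly why the filter output is guaranteed.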

For (ii) we allow for degeneracy of $H$. We overcome this by considering a maximal set of mutually commuting observables, $\{A_1, \ldots, A_f\}$, for which there are common eigenvectors $\psi_{a_1,\ldots,a_f}$ with
\[
A_i\,\psi_{a_1,\ldots,a_f} = a_i\,\psi_{a_1,\ldots,a_f}.
\]
The maximal set means that these states are uniquely determined using these operators; that is, we have a subset of eigenvectors which we differentiate using this maximal set of commuting operators. The device looks like


17.5 Mixed States 


Mixed states encode the 'ignorance' (in the sense of lack of knowledge) on behalf of the experimenter or equipment. For example, the experimenter may be unable to see exactly where the pointer is pointing, and so takes a guess at the reading. This introduces further uncertainty into which final state we obtain. It is important to note, though, that this is not an inherently quantum mechanical property.

The typical set up for preparing a mixed state is as follows:

[Figure: a 'probability generator' with two input states, $\rho_{\psi_k}$ (top) and $\rho_{\psi_\ell}$ (bottom), and output $p\,\rho_{\psi_k} + (1-p)\,\rho_{\psi_\ell}$.]

The 'probability' generator here is some method of choosing which input (left) to output (right), where there is a probability $p$ to use the top input ($\psi_k$). For example it could be a person rolling a dice who says "If I roll a '1' then I shall use the top input, otherwise I'll use the bottom one," in which case $p = 1/6$. Normalisation is taken care of by requiring the other possible outcome to have probability $(1 - p)$.


Remark 17.18. It is very important to realise that the output for a mixed state is the sum of two states; it is not the state made from the sum of two eigenvectors, as was the case with Example 17.12. That is,
\[
p\,\rho_{\psi_k} + (1-p)\,\rho_{\psi_\ell} \neq \rho_{p\psi_k + (1-p)\psi_\ell}.
\]
We highlight this point here as it demonstrates one of the misleading aspects of using bra-ket notation. People often talk about a pure state as one that can be written as a linear superposition of the eigenstates (as with Example 17.12), writing
\[
|\Psi\rangle = a\,|\psi_k\rangle + b\,|\psi_\ell\rangle,
\]
for $a, b \in \mathbb{C}$ and $k \neq \ell$, where the normalisation condition requires $|a|^2 + |b|^2 = 1$. But if we were to think of $|\Psi\rangle$ as the state then this would look like a mixed state: it is the superposition of two pure states.

In order to differentiate a pure state from a mixed state they introduce the density matrices, which are the $\rho$s we've been using, and say that the density matrix of a mixed state is of the form
\[
\rho_{\text{mixed}} = \sum_i p_i\,|\psi_i\rangle\langle\psi_i|,
\]
where $p_i$ is the probability of being in the corresponding state, but then going back to the start of section 17.3, we see this is just the same as what we wrote for a mixed state, without any of the potential confusion.
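The inequality in Remark 17.18 can be made concrete in a two-dimensional toy model. The sketch below assumes $\psi_k = (1, 0)$ and $\psi_\ell = (0, 1)$ as coordinate vectors and $p = 1/2$, builds both sides as $2\times 2$ matrices, and compares them using the quantity $\mathrm{Tr}(\rho^2)$ (often called the purity, a diagnostic not used in the lectures). All function names are illustrative.

```python
def rho_pure(psi):
    """rho_psi = |psi><psi| / <psi|psi> as a nested list of complex numbers."""
    norm_sq = sum(abs(c) ** 2 for c in psi)
    return [[complex(a) * complex(b).conjugate() / norm_sq for b in psi] for a in psi]

def rho_mixed(p, rho_top, rho_bottom):
    """Output of the probability generator: p * rho_top + (1 - p) * rho_bottom."""
    n = len(rho_top)
    return [[p * rho_top[i][j] + (1 - p) * rho_bottom[i][j] for j in range(n)]
            for i in range(n)]

def purity(rho):
    """Tr(rho^2); equals 1 for a pure state and is smaller for a mixed one."""
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n)).real

if __name__ == "__main__":
    p = 0.5
    psi_k, psi_l = [1, 0], [0, 1]
    mixed = rho_mixed(p, rho_pure(psi_k), rho_pure(psi_l))
    superposed = rho_pure([p * a + (1 - p) * b for a, b in zip(psi_k, psi_l)])
    print(mixed, purity(mixed))
    print(superposed, purity(superposed))
```

The two matrices share the same diagonal (the same energy probabilities), but only the state built from the summed vector has non-zero off-diagonal entries.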


18 The Fourier Operator 


This lecture will begin our systematic approach to the study of the so-called Schrödinger operator:
\[
H : \mathcal{D}_H \to L^2(\mathbb{R}^d),
\]
with
\[
H := -\frac{\hbar^2}{2m}\Delta + V,
\]
where $\Delta := \partial_1^2 + \ldots + \partial_d^2$ is the Laplacian operator and $V(x)$ is the potential. The Fourier operator is an indispensable tool in conducting the study.

We will start by expanding on Lemma 12.19. We will then use the fact that the Schwartz space is dense in $L^2(\mathbb{R}^d)$ along with the BLT theorem to provide a prescription for taking the Fourier transform on $L^2(\mathbb{R}^d)$.


18.1 The Fourier Operator on Schwartz Space 
Recalling the definition of the Schwartz space, it is clear that the following facts hold: If $f \in \mathcal{S}(\mathbb{R}^d)$ then

(i) $Q^k f \in \mathcal{S}(\mathbb{R}^d)$ for all $k = 1, \ldots, d$, where $Q^k$ is the $k$-th position operator.

(ii) $P_k f \in \mathcal{S}(\mathbb{R}^d)$ for all $k = 1, \ldots, d$, where $P_k$ is the $k$-th momentum operator.


We shall use these facts in the following calculations. 


Definition. The Fourier operator on Schwartz space is the linear map $\mathcal{F} : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$ with
\[
(\mathcal{F}f)(x) := \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} d^dy\; e^{-ixy} f(y),
\]
where $xy := x_1y_1 + \ldots + x_dy_d$.

Remark 18.1. We are using the $1/(2\pi)^{d/2}$ convention in the definition above. As we shall see, all that is required is a 'total of' $1/(2\pi)^d$ between the Fourier operator and its inverse. This convention is often used as it makes comparing to the inverse easier. Other conventions (for example having $1/(2\pi)^d$ appear in the inverse and just a unit coefficient in the above) find use in certain cases (for example if you were only concerned with taking $\mathcal{F}$).


Remark 18.2. We shall also call the action of the Fourier operator on a function $f \in \mathcal{S}(\mathbb{R}^d)$ the Fourier transform of the function.


We now wish to make some remarks on notation. 


(i) Particularly in physics, it is often intuitive to think of $f$ as a function on 'position space' (in the sense that its argument is a position), while thinking of the Fourier transform $\mathcal{F}f$ as a function on momentum space. Thus, one often relabels the variables as follows
\[
y \to x, \qquad x \to p.
\]
While this does have its advantages at times (the famous example being the motivation behind the derivation of Heisenberg's uncertainty relation, a truly vital relation in quantum mechanics), it can also lead to misconceptions. For example, when thought of this way, one may think that you cannot take the double Fourier transform $\mathcal{F}(\mathcal{F}f)$, as the first one gives a function on momentum space, which the second does not 'act on'. However, from the definition given, this is clearly nonsense: of course you can take it twice, as $\mathcal{F} : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$.

We shall, however, stick to this notation, but we should not be fooled by it about what we can take the Fourier transform of.

(ii) Recall that $(Q^k f)(x) = x^k f(x)$, which is just a number. It is therefore totally meaningless to write something of the form
\[
\mathcal{F}\big(x^k f(x)\big).
\]
However, having to define the operators each time and then taking the Fourier transform of their action on a function could end up quite lengthy, so instead we introduce the following notations
\[
\widehat{x^k f(x)} := \mathcal{F}(Q^k f),
\]
and
\[
\mathcal{F}\big(x \mapsto x^k f(x)\big),
\]
and similarly for other operators.

The former of these two is how one usually sees the Fourier transform written, and it is often just written as
\[
\widehat{g} := \mathcal{F}g,
\]
where $g$ is the result of the action of the operator on $f$ (so here $g := Q^k f$).
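Proposition 18.3. Let $f \in \mathcal{S}(\mathbb{R}^d)$ and $\gamma \in \mathbb{N}_0^d$. Then

(i) $\mathcal{F}\big((-i)^{|\gamma|}\partial_\gamma f\big)(p) = p^\gamma\,(\mathcal{F}f)(p)$,

(ii) $\mathcal{F}\big(x^\gamma f\big)(p) = i^{|\gamma|}\big[\partial_\gamma(\mathcal{F}f)\big](p)$.

Proof. We shall prove these both by induction.

(i) Let $\gamma = k$, and so $|\gamma| = 1$. Then, using integration by parts, we have
\[
\begin{aligned}
\mathcal{F}(-i\partial_k f)(p) &= \frac{1}{(2\pi)^{d/2}}\int d^dx\; e^{-ipx}\,(-i)(\partial_k f)(x) \\
&= -\frac{1}{(2\pi)^{d/2}}\int d^dx\; \partial_k\big(e^{-ipx}\big)(-i)\,f(x) + \text{(boundary term)} \\
&= \frac{1}{(2\pi)^{d/2}}\int d^dx\; p_k\, e^{-ipx} f(x) \\
&= p_k\,(\mathcal{F}f)(p),
\end{aligned}
\]
where we have used the fact that the elements of the Schwartz space are rapidly decaying to remove the boundary term.

Now assume it is true for $|\gamma| = n$. Then, if $\gamma'$ is the next step, from the fact that $\partial_k f \in \mathcal{S}(\mathbb{R}^d)$ we have
\[
\begin{aligned}
\mathcal{F}\big((-i)^{n+1}\partial_{k_1}\cdots\partial_{k_{n+1}} f\big)(p)
&= \mathcal{F}\big((-i)^{n}\partial_{k_1}\cdots\partial_{k_n}(-i\partial_{k_{n+1}} f)\big)(p) \\
&= p_{k_1}\cdots p_{k_n}\cdot\mathcal{F}(-i\partial_{k_{n+1}} f)(p) \\
&= p_{k_1}\cdots p_{k_n}p_{k_{n+1}}\cdot(\mathcal{F}f)(p) \\
&=: p^{\gamma'}\cdot(\mathcal{F}f)(p).
\end{aligned}
\]

(ii) Again let $\gamma = k$, then
\[
\begin{aligned}
\mathcal{F}\big(x \mapsto x^k f(x)\big)(p) &= \frac{1}{(2\pi)^{d/2}}\int d^dx\; e^{-ipx}\, x^k f(x) \\
&= \frac{1}{(2\pi)^{d/2}}\int d^dx\; i\,\partial_{p_k}\big(e^{-ipx}\big)\,f(x) \\
&= i\,\partial_{p_k}\,\frac{1}{(2\pi)^{d/2}}\int d^dx\; e^{-ipx} f(x) \\
&= i\,\partial_{p_k}(\mathcal{F}f)(p).
\end{aligned}
\]
Then using the notation $\partial_k$ gives the result. In fact, we should have really written $\partial_k(p \mapsto e^{-ipx})$ on the second line, i.e. the derivative is taken of the function $p \mapsto e^{-ipx}$ before it is evaluated. Now assume it is true for $|\gamma| = n$. Then, if $\gamma'$ is the next step, we have
\[
\begin{aligned}
\mathcal{F}\big(x^{\gamma'} f\big)(p) &= \mathcal{F}\big(x \mapsto x^{k_1}\cdots x^{k_n}\,x^{k_{n+1}} f(x)\big)(p) \\
&= i^{n}\big[\partial_{k_1}\cdots\partial_{k_n}\mathcal{F}\big(y \mapsto y^{k_{n+1}} f(y)\big)\big](p) \\
&= i^{n+1}\big[\partial_{\gamma'}(\mathcal{F}f)\big](p).
\end{aligned}
\]
□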
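Proposition 18.4. Let $f \in \mathcal{S}(\mathbb{R}^d)$.

(i) Let $a \in \mathbb{R}^d$, then $\widehat{f(x-a)}(p) = e^{-ipa}\cdot\widehat{f(x)}(p)$.

(ii) Let $\lambda \in \mathbb{R}\setminus\{0\}$, then
\[
\widehat{f(\lambda x)}(p) = \frac{1}{|\lambda|^d}\,\widehat{f(x)}\Big(\frac{p}{\lambda}\Big).
\]

Proof. (i) Using the change of variables $y = x - a$, we have
\[
\begin{aligned}
\widehat{f(x-a)}(p) &= \frac{1}{(2\pi)^{d/2}}\int d^dx\; e^{-ipx} f(x-a) \\
&= \frac{1}{(2\pi)^{d/2}}\int d^dy\; e^{-ip(y+a)} f(y) \\
&= e^{-ipa}\,\frac{1}{(2\pi)^{d/2}}\int d^dx\; e^{-ipx} f(x) \\
&= e^{-ipa}\cdot\widehat{f(x)}(p),
\end{aligned}
\]
where we relabelled $y \to x$ again.

(ii) Using the change of variables $y = \lambda x$,
\[
\begin{aligned}
\widehat{f(\lambda x)}(p) &= \frac{1}{(2\pi)^{d/2}}\int d^dx\; e^{-ipx} f(\lambda x) \\
&= \frac{1}{(2\pi)^{d/2}}\,\frac{1}{|\lambda|^d}\int d^dy\; e^{-i\frac{p}{\lambda}y} f(y) \\
&= \frac{1}{|\lambda|^d}\,\frac{1}{(2\pi)^{d/2}}\int d^dx\; e^{-i\frac{p}{\lambda}x} f(x) \\
&= \frac{1}{|\lambda|^d}\,\widehat{f(x)}\Big(\frac{p}{\lambda}\Big),
\end{aligned}
\]
where again we relabelled $y \to x$. □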


18.2 Inverse of Fourier Operator 


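Remark 18.5. This is often also called 'the inverse Fourier transform'.


Lemma 18.6. Let $x \in \mathbb{R}^d$ and $z \in \mathbb{C}$ with $\mathrm{Re}(z) > 0$. Then the following is true:
\[
\mathcal{F}\Big(x \mapsto \exp\Big(-\frac{z}{2}x^2\Big)\Big)(p) = \frac{1}{z^{d/2}}\exp\Big(-\frac{1}{2z}p^2\Big).
\]

Proof. We shall prove this for the case $d = 1$. Let
\[
G_z(x) := \exp\Big(-\frac{z}{2}x^2\Big).
\]
Then we have
\[
(\partial G_z)(x) = -zx\,G_z(x).
\]
Taking the Fourier transform of both sides and using Proposition 18.3 gives
\[
ip\,(\mathcal{F}G_z)(p) = -iz\,\big[\partial(\mathcal{F}G_z)\big](p),
\]
which is an ODE for $\mathcal{F}G_z$. Solving by separation (as done when considering the quantum harmonic oscillator) we arrive at
\[
(\mathcal{F}G_z)(p) = A\exp\Big(-\frac{p^2}{2z}\Big).
\]
Plugging in $p = 0$ and the definitions for the LHS gives
\[
\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} dx\; 1\cdot e^{-\frac{z}{2}x^2} = A.
\]
Then employing the fact that the integral above is holomorphic³⁹ we can extend the standard integral result
\[
\int_{\mathbb{R}} dx\; e^{-\frac{\sigma}{2}x^2} = \sqrt{\frac{2\pi}{\sigma}},
\]
for $\sigma \in \mathbb{R}^+$, to the case we are considering, giving
\[
A = \frac{1}{\sqrt{z}}.
\]
□

³⁹ See 'Fourier Series, Fourier Transform and Their Application to Mathematical Physics' by V. Serov, Chapter 16.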
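Lemma 18.6 can be spot-checked numerically in $d = 1$. The sketch below approximates the Fourier integral with the trapezoidal rule on a finite interval (an approximation, justified here only because the Gaussian decays so quickly) and compares it with the closed form $z^{-1/2}e^{-p^2/(2z)}$, taking the principal branch of the square root. The function names, the interval half-width and the step count are all assumptions made for illustration.

```python
import cmath
import math

def fourier_1d(f, p, half_width=12.0, steps=4000):
    """Approximate (F f)(p) = (2 pi)^(-1/2) * integral of e^{-ipx} f(x) dx.

    Uses the trapezoidal rule on [-half_width, half_width]; this is only
    sensible for functions that are negligible outside that interval.
    """
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * p * x) * f(x)
    return total * h / math.sqrt(2.0 * math.pi)

def gaussian(z):
    """Return the function G_z(x) = exp(-z x^2 / 2) for Re(z) > 0."""
    return lambda x: cmath.exp(-0.5 * z * x * x)

def lemma_rhs(z, p):
    """Closed form z^(-1/2) * exp(-p^2 / (2 z)) with the principal square root."""
    return cmath.exp(-p * p / (2.0 * z)) / cmath.sqrt(z)

if __name__ == "__main__":
    for z, p in ((1.0, 0.0), (2.0, 1.3), (1.0 + 1.0j, 0.8)):
        print(z, p, fourier_1d(gaussian(z), p), lemma_rhs(z, p))
```

The last sample uses a genuinely complex $z$ with positive real part, which is the case the holomorphy argument in the proof is needed for.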


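Theorem 18.7. The Fourier operator $\mathcal{F} : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$ is invertible with inverse
\[
(\mathcal{F}^{-1}g)(x) := \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} d^dp\; e^{+ipx} g(p).
\]

Proof. We need to show that $[\mathcal{F}^{-1}(\mathcal{F}f)](x) = f(x)$. In order to do so, we shall have to introduce a regulator
\[
\lim_{\varepsilon\to 0} e^{-\frac{\varepsilon}{2}p^2} = 1
\]
into the integral. We shall then use the fact that the integrand is dominated by $|(\mathcal{F}f)(p)|$, and the fact that we are using Lebesgue integrals, to pull out the limit. We shall also use Fubini's theorem to change the order of the integrals.
\[
\begin{aligned}
[\mathcal{F}^{-1}(\mathcal{F}f)](x)
&= \frac{1}{(2\pi)^{d/2}}\int d^dp\; e^{ipx}\,(\mathcal{F}f)(p) \\
&= \frac{1}{(2\pi)^{d/2}}\int d^dp\; \lim_{\varepsilon\to 0} e^{-\frac{\varepsilon}{2}p^2} e^{ipx}\,(\mathcal{F}f)(p) \\
&= \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{d/2}}\int d^dp\; e^{-\frac{\varepsilon}{2}p^2} e^{ipx}\,(\mathcal{F}f)(p) \\
&= \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{d/2}}\int d^dp\; e^{-\frac{\varepsilon}{2}p^2} e^{ipx}\,\frac{1}{(2\pi)^{d/2}}\int d^dy\; e^{-ipy} f(y) \\
&= \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{d/2}}\int d^dy\; f(y)\,\frac{1}{(2\pi)^{d/2}}\int d^dp\; e^{-\frac{\varepsilon}{2}p^2} e^{-ip(y-x)} \\
&= \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{d/2}}\int d^dy\; f(y)\,\mathcal{F}\big(p \mapsto e^{-\frac{\varepsilon}{2}p^2}\big)(y-x) \\
&= \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{d/2}}\int d^dy\; f(y)\,\frac{1}{\varepsilon^{d/2}}\exp\Big(-\frac{(y-x)^2}{2\varepsilon}\Big) \\
&= \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{d/2}}\int d^dz\; \frac{1}{\varepsilon^{d/2}}\exp\Big(-\frac{z^2}{2\varepsilon}\Big) f(z+x) \\
&= \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{d/2}}\int d^dz'\; \exp\Big(-\frac{1}{2}(z')^2\Big) f(\sqrt{\varepsilon}\,z'+x) \\
&= \frac{1}{(2\pi)^{d/2}}\int d^dz'\; \exp\Big(-\frac{1}{2}(z')^2\Big)\lim_{\varepsilon\to 0} f(\sqrt{\varepsilon}\,z'+x) \\
&= \frac{1}{(2\pi)^{d/2}}\,(2\pi)^{d/2}\,f(x) \\
&= f(x),
\end{aligned}
\]
where we have used Lemma 18.6 (with $\varepsilon$ in place of the parameter there), the substitutions $z = y - x$ and then $z = \sqrt{\varepsilon}\,z'$, along with the standard Gaussian integral $\int d^dz'\, e^{-\frac{1}{2}(z')^2} = (2\pi)^{d/2}$. □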


18.3 Extension of $\mathcal{F}$ to $L^2(\mathbb{R}^d)$


We already know that $\mathcal{F}$ is densely defined on $L^2(\mathbb{R}^d)$, so if we can show it is bounded the BLT theorem will tell us there is a unique, bounded extension of $\mathcal{F}$ on $L^2(\mathbb{R}^d)$.


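Theorem 18.8 (Parseval's Theorem). Let $f \in \mathcal{S}(\mathbb{R}^d)$, then
\[
\int_{\mathbb{R}^d} d^dp\; |(\mathcal{F}f)(p)|^2 = \int_{\mathbb{R}^d} d^dx\; |f(x)|^2.
\]
Proof. The proof follows by direct calculation.
\[
\begin{aligned}
\int_{\mathbb{R}^d} d^dp\; |(\mathcal{F}f)(p)|^2
&= \int_{\mathbb{R}^d} d^dp\; \overline{(\mathcal{F}f)(p)}\,(\mathcal{F}f)(p) \\
&= \int_{\mathbb{R}^d} d^dp\; \Big[\frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} d^dx\; e^{ipx}\,\overline{f(x)}\Big](\mathcal{F}f)(p) \\
&= \int_{\mathbb{R}^d} d^dx\; \overline{f(x)}\,\Big[\frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} d^dp\; e^{ipx}\,(\mathcal{F}f)(p)\Big] \\
&= \int_{\mathbb{R}^d} d^dx\; \overline{f(x)}\,[\mathcal{F}^{-1}(\mathcal{F}f)](x) \\
&= \int_{\mathbb{R}^d} d^dx\; |f(x)|^2,
\end{aligned}
\]
where we used the fact that the integrals are over real domains (so that $\overline{e^{-ipx}} = e^{ipx}$), Fubini's theorem to swap the order of integration, and Theorem 18.7. □


It follows from Parseval's theorem, then, that
\[
\|\mathcal{F}\| := \sup_{f\in\mathcal{S}(\mathbb{R}^d)}\frac{\|\mathcal{F}f\|_{L^2(\mathbb{R}^d)}}{\|f\|_{L^2(\mathbb{R}^d)}}
= \sup_{f\in\mathcal{S}(\mathbb{R}^d)}\frac{\sqrt{\int_{\mathbb{R}^d} d^dp\; |(\mathcal{F}f)(p)|^2}}{\sqrt{\int_{\mathbb{R}^d} d^dx\; |f(x)|^2}}
= 1,
\]
where the supremum is over non-zero $f$, so we have a unique extension
\[
\mathcal{F} : L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d).
\]

In practice if we wanted to take the Fourier transform of a function $f \in L^2(\mathbb{R}^d)\setminus\mathcal{S}(\mathbb{R}^d)$ then we do it via the following prescription: Let $\{f_n\}_{n\in\mathbb{N}} \subset \mathcal{S}(\mathbb{R}^d)$ with $\lim_{n\to\infty} f_n = f$, then
\[
\mathcal{F}f = \mathcal{F}\big(\lim_{n\to\infty} f_n\big) = \lim_{n\to\infty}(\mathcal{F}f_n),
\]
where we have used the fact that $\mathcal{F}$ is bounded to pull out the limit.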


18.4 Convolutions 
Definition. The convolution of two functions $f, g \in L^1(\mathbb{R}^d)$, written $f * g$, is the $L^1(\mathbb{R}^d)$ function defined pointwise by
\[
(f * g)(x) := \int_{\mathbb{R}^d} d^dy\; f(x-y)\,g(y).
\]

Lemma 18.9. The convolution of two functions is symmetric, i.e.
\[
f * g = g * f.
\]

Proof. The result comes from a simple change of variables along with the commutativity of the complex multiplication,
\[
\begin{aligned}
(f * g)(x) &:= \int_{\mathbb{R}^d} d^dy\; f(x-y)\,g(y) \\
&= (-1)^{2d}\int_{\mathbb{R}^d} d^dz\; f(z)\,g(x-z) \\
&= \int_{\mathbb{R}^d} d^dz\; g(x-z)\,f(z) \\
&= \int_{\mathbb{R}^d} d^dy\; g(x-y)\,f(y) \\
&= (g * f)(x),
\end{aligned}
\]
where we substituted $z = x - y$, and the $(-1)^{2d}$ term comes from the fact that $d^dy = (-1)^d d^dz$ along with the fact that the integral limits swap. Since $x \in \mathbb{R}^d$ was arbitrary, we have the result. □


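Lemma 18.10. The convolution is associative, i.e.
\[
(f * g) * h = f * (g * h).
\]
Proof. By direct calculation using Fubini's Theorem, and the previous lemma:
\[
\begin{aligned}
((f * g) * h)(x) &= (h * (f * g))(x) \\
&= \int_{\mathbb{R}^d} d^dy\; h(x-y)\,(g * f)(y) \\
&= \int_{\mathbb{R}^d} d^dy\; h(x-y)\int_{\mathbb{R}^d} d^dz\; g(y-z)\,f(z) \\
&= \int_{\mathbb{R}^d} d^dz\; f(z)\int_{\mathbb{R}^d} d^dy\; h(x-y)\,g(y-z) \\
&= \int_{\mathbb{R}^d} d^dz\; f(z)\int_{\mathbb{R}^d} d^dw\; h\big((x-z)-w\big)\,g(w) \\
&= \int_{\mathbb{R}^d} d^dz\; f(z)\,(h * g)(x-z) \\
&= ((g * h) * f)(x) \\
&= (f * (g * h))(x),
\end{aligned}
\]
where we substituted $w = y - z$ in the inner integral. This holds for all $x \in \mathbb{R}^d$, giving the result. □

Lemma 18.11. The convolution is distributive across addition, i.e.
\[
f * (g + h) = f * g + f * h,
\]
where the addition is defined pointwise.


Proof. This follows from the linearity of the Lebesgue integral:
\[
\begin{aligned}
(f * (g + h))(x) &= \int_{\mathbb{R}^d} d^dy\; f(x-y)\,(g+h)(y) \\
&= \int_{\mathbb{R}^d} d^dy\; f(x-y)\,g(y) + \int_{\mathbb{R}^d} d^dy\; f(x-y)\,h(y) \\
&= (f * g)(x) + (f * h)(x),
\end{aligned}
\]
which holds for all $x \in \mathbb{R}^d$, giving the result. □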
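The three algebraic properties just proved have an exact discrete analogue, which makes them easy to experiment with. The sketch below replaces the integral by a finite sum over list indices (a discrete convolution of finitely supported sequences, used here only as an illustration and not the $L^1(\mathbb{R}^d)$ convolution itself) and checks symmetry, associativity and distributivity on integer data, where no rounding occurs. The helper names are hypothetical.

```python
def convolve(f, g):
    """Discrete convolution: out[n] = sum over i + j = n of f[i] * g[j]."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def add(g, h):
    """Pointwise sum of two sequences, padding the shorter one with zeros."""
    n = max(len(g), len(h))
    return [(g[i] if i < len(g) else 0) + (h[i] if i < len(h) else 0)
            for i in range(n)]

if __name__ == "__main__":
    f, g, h = [1, 2, 3], [0, 1, -1], [2, 0, 5, 1]
    print(convolve(f, g) == convolve(g, f))                            # symmetry
    print(convolve(convolve(f, g), h) == convolve(f, convolve(g, h)))  # associativity
    print(convolve(f, add(g, h)) == add(convolve(f, g), convolve(f, h)))  # distributivity
```

Each comparison mirrors one of Lemmas 18.9 to 18.11, with the change of summation index playing the role of the change of variables in the proofs.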


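Theorem 18.12. The Fourier transform of the convolution of two functions is proportional to the product of their Fourier transforms, explicitly
\[
\mathcal{F}(f * g) = (2\pi)^{d/2}\,\mathcal{F}(f)\cdot\mathcal{F}(g).
\]

Proof. By direct calculation,
\[
\begin{aligned}
\mathcal{F}(f * g)(p) &= \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} d^dx\; e^{-ipx}\,(f * g)(x) \\
&= \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} d^dx\; e^{-ipx}\int_{\mathbb{R}^d} d^dy\; f(x-y)\,g(y) \\
&= \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} d^dy\;\Big(\int_{\mathbb{R}^d} d^dx\; e^{-ipx} f(x-y)\Big)\,g(y) \\
&= \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} d^dy\int_{\mathbb{R}^d} d^dz\; e^{-ipz} f(z)\,e^{-ipy} g(y) \\
&= (2\pi)^{d/2}\,\Big(\frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} d^dz\; e^{-ipz} f(z)\Big)\cdot\Big(\frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} d^dy\; e^{-ipy} g(y)\Big) \\
&= (2\pi)^{d/2}\,(\mathcal{F}f)(p)\cdot(\mathcal{F}g)(p),
\end{aligned}
\]
where we substituted $z = x - y$, using the fact that we can consider the convolution integration variable (the $y$) as a constant when relabelling the Fourier transform variable, and used the fact that the Fourier transform of a function is finite to make the integral of an integral into a product of integrals. Finally, since this is true for all $p \in \mathbb{R}^d$ the result follows. □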


19 The Schrödinger Operator


The Schrödinger operator for a vanishing potential is given by
\[
H_{\text{free}} := -\frac{\hbar^2}{2m}\Delta,
\]
and it corresponds to the energy observable for a free particle⁴⁰ of mass $m$.

This lecture aims to derive the spectrum of $H_{\text{free}}$ and use it to study the time evolution of pure states. We shall consider $d = 3$ throughout this lecture, and shall make heavy use of the results of last lecture. We shall also use units such that $m = \frac{1}{2}\hbar^2$ in order to lighten notation; i.e. $H_{\text{free}} = -\Delta$.


19.1 Domain of Self Adjointness of $H_{\text{free}}$


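If we want to talk about $H_{\text{free}}$ being an observable, we need to show that it is self adjoint on some domain. From (i) in Proposition 18.3 it follows that
\[
\mathcal{F}(-\Delta\psi)(p) = |p|^2\,(\mathcal{F}\psi)(p) =: (P^2\mathcal{F}\psi)(p),
\]
where $|p|^2 := p_1^2 + p_2^2 + p_3^2$. In other words,
\[
\mathcal{F}(-\Delta\psi) = P^2\mathcal{F}\psi.
\]

Remark 19.1. The physicist says "in momentum space the Laplacian acts simply by multiplication by the squared norm of the momentum, $|p|^2$".


We can now rewrite the above by inserting $\mathrm{id}_{L^2(\mathbb{R}^3)} = \mathcal{F}^{-1}\mathcal{F}$ to give
\[
\mathcal{F}\circ H_{\text{free}}\circ\mathcal{F}^{-1}\circ\mathcal{F}\psi = P^2\circ\mathcal{F}\psi,
\]
from which it follows
\[
\mathcal{F}H_{\text{free}}\mathcal{F}^{-1} = P^2,
\]
whose maximal domain is
\[
\mathcal{D}_{P^2} = \{\psi \in L^2(\mathbb{R}^3) \mid P^2\psi \in L^2(\mathbb{R}^3)\}.
\]

Theorem 19.2. A maximally defined real multiplication operator is self adjoint on its maximal domain.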


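Proof. Let
\[
A : \mathcal{D}_A \to \mathcal{H}, \qquad \psi \mapsto a\psi,
\]
where $a \in \mathbb{R}$ and $\mathcal{D}_A$ is the maximal domain of $A$ (i.e. there are no elements outside this domain such that $a\psi \in \mathcal{H}$). This operator is clearly symmetric as, for $\psi, \varphi \in \mathcal{D}_A$,
\[
\langle\varphi|A\psi\rangle = \langle\varphi|a\psi\rangle = \langle a\varphi|\psi\rangle = \langle A\varphi|\psi\rangle.
\]
We have therefore $A \subseteq A^*$, which means $\mathcal{D}_A \subseteq \mathcal{D}_{A^*}$ with $A^*\psi = A\psi$ for all $\psi \in \mathcal{D}_A$, but $\mathcal{D}_A$ is maximal so there is no $\psi \notin \mathcal{D}_A$ such that $A^*\psi = a\psi \in \mathcal{H}$, and so $\mathcal{D}_{A^*} \subseteq \mathcal{D}_A$. Therefore $A = A^*$ on this maximal domain. □

⁴⁰ Free particle here means what we think of classically as a free particle: it experiences no potential.


From this theorem, then, we have that $\mathcal{F}H_{\text{free}}\mathcal{F}^{-1}$ is self adjoint on the domain $\mathcal{D}_{P^2}$.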


19.2 Spectrum of $H_{\text{free}}$


One can quickly find the spectrum of $H_{\text{free}}$ using the resolvent map,
\[
R_{P^2}(z) = \frac{1}{|p|^2 - z},
\]
from which it clearly follows that the resolvent set is simply
\[
\rho(P^2) = \{z \in \mathbb{C} \mid z \neq |p|^2\}.
\]
Then using the fact that $|p|^2 \in \mathbb{R}$ with $|p|^2 > 0$⁴¹ we have
\[
\rho(P^2) = \mathbb{C}\setminus\mathbb{R}^+.
\]
Then finally using the definition of the spectrum as the complement of the resolvent set we get
\[
\sigma(H_{\text{free}}) = \sigma(P^2) = \mathbb{R}^+.
\]


Remark 19.3. The lecture obtains this result by a different route (as far as I can tell, Dr. Schuller
is introducing a form of the spectral theorem in which the integral is performed over the
spectrum of the operator, and then uses the characteristic function he introduced).
I shall not type it up here, to avoid potential confusion for readers. If you do follow the
complete method please feel free to contact me and I can add it and give you credit.


Proposition 19.4. For every self adjoint operator there is always some transformation 
which transforms the operator into a mere multiplication operator. 


Remark 19.5. Once you know the transformation for the self adjoint operator of interest, the
spectrum always follows by the same method.


41 We use a non-strict inequality since, if $|p| = 0$, there is no momentum and the operator $P^2$ just maps to $0$.
A2 


tf VOITS IUCIeS 




19.3 Time Evolution 


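Recall that Axiom 4 tells us the evolution of a state is given by$^{42}$

$$\rho_{t_2} = e^{-i(t_2 - t_1)H}\, \rho_{t_1}\, e^{i(t_2 - t_1)H},$$

which for a pure state becomes

$$\frac{\langle\psi_{t_2}|\,\cdot\,\rangle}{\langle\psi_{t_2}|\psi_{t_2}\rangle}\,\psi_{t_2} = e^{-i(t_2 - t_1)H} \circ \left(\frac{\langle\psi_{t_1}|\,\cdot\,\rangle}{\langle\psi_{t_1}|\psi_{t_1}\rangle}\,\psi_{t_1}\right) \circ e^{i(t_2 - t_1)H}.$$

If we now choose to view the RHS as the following composition

$$\frac{1}{\langle\psi_{t_1}|\psi_{t_1}\rangle}\Big(e^{-i(t_2 - t_1)H}\psi_{t_1}\Big)\Big(\langle\psi_{t_1}|\,\cdot\,\rangle \circ e^{i(t_2 - t_1)H}\Big) = \frac{\langle e^{-i(t_2 - t_1)H}\psi_{t_1}|\,\cdot\,\rangle}{\langle e^{-i(t_2 - t_1)H}\psi_{t_1}| e^{-i(t_2 - t_1)H}\psi_{t_1}\rangle}\, e^{-i(t_2 - t_1)H}\psi_{t_1},$$

it follows that we can represent the time evolution of a pure state via the evolution of the
Hilbert space element as

$$\psi_{t_2} = e^{-i(t_2 - t_1)H}\psi_{t_1}.$$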

Remark 19.6. We are assuming here that $H$ is time-independent. If it weren't, the exponent
would instead involve an integral of $H$ over time (time-ordered, in general).


Remark 19.7. We should note that in the above time is viewed as a parameter, not a
coordinate, i.e. this is not a spacetime picture. The elements $\psi_{t_1}$ and $\psi_{t_2}$ are simply
elements of the Hilbert space, each of which is associated to a different time. This can be
compared to saying classically that the position of a particle is an element of $\mathbb{R}^3$, and at a
later time its position is still an element of $\mathbb{R}^3$, although potentially a different one.


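Now for the free particle we have

$$\mathcal{F} H_{\mathrm{free}} \mathcal{F}^{-1} = P^2,$$

which, along with the fact that $P^2$ is a multiplication operator, gives us

$$e^{-itH_{\mathrm{free}}} = \mathcal{F}^{-1} e^{-itP^2} \mathcal{F}.$$

Acting with both sides on $\psi$ gives

$$e^{-itH_{\mathrm{free}}}\psi = \mathcal{F}^{-1}\big(p \mapsto e^{-it|p|^2} (\mathcal{F}\psi)(p)\big),$$

which is the inverse Fourier transform of a product of functions, and so should correspond
to a convolution. It is then tempting to use

$$f * g = (2\pi)^{3/2}\, \mathcal{F}^{-1}\big((\mathcal{F}f) \cdot (\mathcal{F}g)\big)$$

to give us$^{44}$

$$\big(e^{-itH_{\mathrm{free}}}\psi\big)(x) = \frac{1}{(2\pi)^{3/2}} \Big(\mathcal{F}^{-1}\big(p \mapsto e^{-it|p|^2}\big) * \psi\Big)(x),$$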


43 We're using units such that $\hbar = 1$.
44 


sn en: Cece’ % i ey 


VENICE Der UY — 9 Here. 




and then use Lemma 18.6 to give

$$\mathcal{F}^{-1}\big(p \mapsto e^{-it|p|^2}\big)(x) = \frac{1}{(2it)^{3/2}} \exp\left(-\frac{|x|^2}{4it}\right),$$

however there is a problem with both of these steps.

Firstly, for us to use the convolution theorem we require both functions to be in $L^1(\mathbb{R}^3)$.
For $\psi$ we can simply take the intersection $\psi \in L^2(\mathbb{R}^3) \cap L^1(\mathbb{R}^3)$, however the exponential term
is unavoidably not in $L^1(\mathbb{R}^3)$ (if you take the absolute value you get 1, and then integrating
over all of $\mathbb{R}^3$ gives a divergent result). On top of that, in order to use Lemma 18.6 we
require the real part of the coefficient to be strictly positive (in order to avoid the branch
cut), but $\mathrm{Re}(it) = 0$.

Luckily we can fix both of these problems with the same step: regularisation. We
regularise both by introducing a positive, real factor into the exponential and then taking
the limit,

$$e^{-it|p|^2} = \lim_{\varepsilon \to 0} e^{-(it + \varepsilon)|p|^2}.$$

The addition of $\varepsilon$ stops the integral diverging (because of the minus sign) and we also have
$\mathrm{Re}(it + \varepsilon) = \varepsilon > 0$ and so we can use Lemma 18.6.


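So, using the continuity of the product and the inverse Fourier transform we have

$$\big(e^{-itH_{\mathrm{free}}}\psi\big)(x) = \lim_{\varepsilon \to 0} \left[\left(x \mapsto \frac{1}{(4\pi(it + \varepsilon))^{3/2}} \exp\left(-\frac{|x|^2}{4(it + \varepsilon)}\right)\right) * \psi\right](x)$$

$$= \lim_{\varepsilon \to 0} \frac{1}{(4\pi(it + \varepsilon))^{3/2}} \int_{\mathbb{R}^3} d^3y\, \exp\left(-\frac{|x - y|^2}{4(it + \varepsilon)}\right)\psi(y).$$

Finally using dominated convergence to take the limit inside the integral, we have$^{45}$

$$\psi_t(x) = \frac{1}{(4\pi it)^{3/2}} \int_{\mathbb{R}^3} d^3y\, \exp\left(-\frac{|x - y|^2}{4it}\right)\psi_0(y),$$

where $\psi_0 := \psi$ and $\psi_t := e^{-itH_{\mathrm{free}}}\psi_0$.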
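The propagator formula can be checked numerically. The sketch below is our own illustration under stated assumptions: it works in one dimension instead of three (so the prefactor is $(4\pi it)^{-1/2}$), uses $H = -d^2/dx^2$ with $\hbar = 1$, and takes the Gaussian $\psi_0(y) = e^{-y^2/2}$, for which the evolved state is known in closed form, $\psi_t(x) = (1 + 2it)^{-1/2}\exp(-x^2/(2(1 + 2it)))$. All function names are hypothetical.

```python
import cmath
import math

def evolve_free_1d(psi0, x, t, y_max=12.0, n=24000):
    """Trapezoidal estimate of the 1D free evolution

        (4*pi*i*t)^(-1/2) * integral over [-y_max, y_max] of
        exp(-(x - y)^2 / (4*i*t)) * psi0(y) dy.
    """
    h = 2.0 * y_max / n
    total = 0j
    for k in range(n + 1):
        y = -y_max + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp(-((x - y) ** 2) / (4j * t)) * psi0(y)
    return total * h / cmath.sqrt(4j * math.pi * t)

def gaussian(y):
    """Initial state psi_0(y) = exp(-y^2 / 2) (not normalised)."""
    return math.exp(-y * y / 2.0)

def exact_gaussian(x, t):
    """Closed-form evolution of the Gaussian above under H = -d^2/dx^2."""
    z = 1.0 + 2j * t
    return cmath.exp(-x * x / (2.0 * z)) / cmath.sqrt(z)

if __name__ == "__main__":
    x, t = 0.5, 1.0
    print(evolve_free_1d(gaussian, x, t), exact_gaussian(x, t))
```

The integrand only oscillates (its modulus is that of $\psi_0$), which is why the Gaussian decay of $\psi_0$ is what makes the truncated integral converge here.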


19.4 Physical Interpretation (for the Patient Observer) 


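After massaging the above result a bit we arrive at the famous 'spreading of the wavefunction'
result.
First start by expanding

$$|x - y|^2 = |x|^2 + |y|^2 - 2x \cdot y$$

to give

$$\psi_t(x) = \frac{\exp\left(-\frac{|x|^2}{4it}\right)}{(4\pi it)^{3/2}} \int_{\mathbb{R}^3} d^3y\, \exp\left(-\frac{i\, x \cdot y}{2t}\right) \exp\left(-\frac{|y|^2}{4it}\right)\psi_0(y),$$

which is a Fourier transform with result

$$\psi_t(x) = \frac{\exp\left(-\frac{|x|^2}{4it}\right)}{(2it)^{3/2}}\, \mathcal{F}\left(y \mapsto \exp\left(-\frac{|y|^2}{4it}\right)\psi_0(y)\right)\left(\frac{x}{2t}\right).$$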


45 Note $t = t_2 - t_1$ here.




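We now use the fact that we have a patient observer (i.e. one who watches for a long
time) to take the asymptotic behaviour$^{46}$ to give

$$\psi_t(x) \sim \frac{\exp\left(-\frac{|x|^2}{4it}\right)}{(2it)^{3/2}}\, (\mathcal{F}\psi_0)\left(\frac{x}{2t}\right),$$

which, if the $\psi$s were viewed as 'waves' (i.e. plots on a graph) would indicate that the 'wave
spreads out over time'. In other words, simplifying to $\mathbb{R}$ instead of $\mathbb{R}^3$ we'd have something
along the lines of the following diagram.

[Figure: sketch of the wavefunction over $\mathbb{R}$ at successive times.]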


So the function appears to spread out over time (keeping the area under $|\psi_t|^2$ constant).
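The spreading can be made quantitative in a simple case. The sketch below is our own illustration: it assumes one dimension, $H = -d^2/dx^2$ with $\hbar = 1$, and the Gaussian initial state $\psi_0(x) = e^{-x^2/2}$, for which $|\psi_t(x)|^2 = (1 + 4t^2)^{-1/2}\exp(-x^2/(1 + 4t^2))$. It checks that the total weight stays at $\sqrt{\pi}$ while the variance grows as $(1 + 4t^2)/2$. The function names are hypothetical.

```python
import math

def density(x, t):
    """|psi_t(x)|^2 for psi_0(x) = exp(-x^2 / 2) evolved with H = -d^2/dx^2."""
    s = 1.0 + 4.0 * t * t
    return math.exp(-x * x / s) / math.sqrt(s)

def moments(t, x_max=100.0, n=100_000):
    """Return (integral of the density, variance of x) via the trapezoidal rule."""
    h = 2.0 * x_max / n
    norm = 0.0
    second = 0.0
    for k in range(n + 1):
        x = -x_max + k * h
        weight = 0.5 if k in (0, n) else 1.0
        rho = density(x, t)
        norm += weight * rho * h
        second += weight * x * x * rho * h
    return norm, second / norm

if __name__ == "__main__":
    for t in (0.0, 1.0, 5.0):
        norm, variance = moments(t)
        print(f"t={t}: norm={norm:.6f}, variance={variance:.6f}")
```

The constant norm is the statement that the area is preserved; the growing variance is the 'spreading'.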


46 That is, take the limit $t \to \infty$ at places where it won't cause problems.




20 Periodic Potentials 


This lecture aims to look at periodic potentials and find the most general information we
can about their energy spectrum. In order to do this we will use so-called Rigged Hilbert
Spaces.$^{47}$


Definition. The Hamiltonian for a periodic potential is of the form

$$H = -\frac{\hbar^2}{2m}\Delta + V(x),$$

with

(i) Periodicity in $V(x)$, i.e. $V(x + a) = V(x)$ for all $x \in \mathbb{R}^d$, where $a$ is the periodicity of
the system

(ii) $V$ is piecewise continuous and bounded.

[Figure: a periodic potential $V(x)$ plotted over $\mathbb{R}$.]

As we shall see, by making no assumptions apart from the above, we will be able to
extract a remarkable generic conclusion about the energy spectrum of a particle moving in
a generic periodic potential. This is truly an amazing result as the potential can even be
discontinuous (countably) infinitely many times! A huge application of this formalism is in the
study of solid state physics, where the periodic potential is that generated by a
regular lattice with so-called lattice constant $a$.

As we shall see, the general result is that the energy spectrum comes in continuous,
open intervals in $\mathbb{R}$, known as bands.

[Figure: energy axis $E$ showing the spectrum as a sequence of bands.]


47 I am currently reading up on these, and will add an additional section to the end of these notes once I
have a better idea of them.




20.1 Basics of Rigged Hilbert Space 


As mentioned at the start, we wish to make use of rigged Hilbert spaces$^{48}$ in order to find
the spectrum of the energy observable. The basic reason behind this is that rigged Hilbert
spaces essentially extend what we usually think of as the eigenvalue equation

$$H\psi = E\psi,$$

where $E$ is a discrete value in $\mathbb{R}$, to the case where $E$ can be continuous. This is known as
the generalised eigenvalue equation.

We do this because ultimately we know that the spectrum will be continuous intervals, 
however even if it was purely discrete, or a combination of both, the theory of rigged Hilbert 
spaces would still account for this. 

The basic idea behind rigged Hilbert spaces is to consider elements $\Psi$ that satisfy the
generalised eigenvalue equation, but do not lie in $L^2(\mathbb{R}^d)$. It turns out that they lie in the
dual of a dense subspace, which for us is the Schwartz space. In this way we
construct our so-called Gelfand triple:

$$S(\mathbb{R}^d) \subset L^2(\mathbb{R}^d) \subset S^*(\mathbb{R}^d).$$

The easiest way to see that we need such a construction here is that, as the Hamiltonian
is constructed using derivative operators, its eigenvectors are likely to be of the form

$$\psi(x) \propto e^{ikx}$$

with $k$ real, which is clearly not square integrable (its modulus is 1).


Remark 20.1. It is important to note here that a rigged Hilbert space is not some extension
of the physics or of quantum mechanics, but indeed it is the most natural mathematical
structure required in order to study quantum mechanics. In fact it is the rigged Hilbert
space structure which provides the full mathematical foundation in order to understand
Dirac's bra-ket notation, and it introduces the well known Dirac delta function. This gives
the insight into what a rigged Hilbert space is: it is the equipping (i.e. the 'rigging')
of a Hilbert space with a theory of distributions.


Proposition 20.2. Any $H$-eigenvector $\Psi \in S^*(\mathbb{R}^d) \setminus L^2(\mathbb{R}^d)$ has a purely continuous energy
spectrum.


20.2 Fundamental Solutions and Fundamental Matrix 


Definition. A set of solutions $\{\psi_1, \dots, \psi_n\}$ for a system of linear, homogeneous ordinary
differential equations is known as a fundamental set of solutions if

(i) They are linearly independent

(ii) Any other solution to the ODEs can be expanded using the set $\{\psi_1, \dots, \psi_n\}$.


48 Again, coming soon! 




In other words, they form a basis for the solution space. 


Proposition 20.3. The cardinality of the fundamental set of solutions of a system of n-th 
order ODEs is n. 


Example 20.4. Let the system of ODEs just be the single equation

$$\psi''(x) + \omega^2\psi(x) = 0,$$

for some non-vanishing $\omega \in \mathbb{R}$. Then a fundamental set of solutions is

$$\psi_1 = \cos(\omega x) \quad\text{and}\quad \psi_2 = \sin(\omega x).$$

It is easy enough to see that these two solutions do indeed form a basis for the solution
space.


Lemma 20.5. Let $\{\psi_1, \dots, \psi_n\}$ be a set of fundamental solutions for some system of ODEs.
Then the set $\{c_1\psi_1, \dots, c_n\psi_n\}$ for non-zero $c_i \in \mathbb{F}$ (the underlying field) is also a set of fundamental
solutions.


As our eigenvalue equation is a second order ODE there are 2 fundamental solutions. These
fundamental solutions depend on the value of $E$ and so we label them as $\{\psi_1^E, \psi_2^E\}$. We
remove the ambiguity in the coefficients by requiring

$$\psi_1^E(0) = 1, \qquad (\psi_1^E)'(0) = 0,$$
$$\psi_2^E(0) = 0, \qquad (\psi_2^E)'(0) = 1.$$


Definition. An entire function is a complex valued function that is holomorphic ($\mathbb{C}$-differentiable
in the neighbourhood of a point) at all finite points in the complex plane.

Theorem 20.6. The fundamental solutions $\psi_1^E$ and $\psi_2^E$ are entire functions of $E$.


Remark 20.7. Note in order to make the above theorem true, we require that our eigenvalues
are complex, $E \in \mathbb{C}$. This is clearly unphysical, however we do this here in order to exploit the
strong results of complex analysis, and then we shall restrict ourselves to $E \in \mathbb{R}$ at the end.


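Definition. The fundamental matrix of a system of linear, homogeneous ODEs, with the
fundamental set of solutions $\{\psi_1, \dots, \psi_n\}$ is the matrix

$$M(x) = \begin{pmatrix} \psi_1(x) & \cdots & \psi_n(x) \\ \vdots & & \vdots \\ \psi_1^{(n-1)}(x) & \cdots & \psi_n^{(n-1)}(x) \end{pmatrix}.$$

For our second order problem this is the $2 \times 2$ matrix with first row $(\psi_1^E(x), \psi_2^E(x))$ and second row $((\psi_1^E)'(x), (\psi_2^E)'(x))$.

Lemma 20.8. The determinant of the fundamental matrix for our system is constant,$^{49}$

$$(\det M^E)'(x) = 0.$$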


49 Note we also label $M$ with an $E$.




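Proof. By direct calculation, and using the generalised eigenvalue equation,

$$\begin{aligned}
\big[\psi_1^E(\psi_2^E)' - (\psi_1^E)'\psi_2^E\big]'(x) &= \big[(\psi_1^E)'(\psi_2^E)' + \psi_1^E(\psi_2^E)'' - (\psi_1^E)''\psi_2^E - (\psi_1^E)'(\psi_2^E)'\big](x) \\
&= \Big[\psi_1^E\Big(-\frac{2m}{\hbar^2}(E - V)\psi_2^E\Big) - \Big(-\frac{2m}{\hbar^2}(E - V)\psi_1^E\Big)\psi_2^E\Big](x) \\
&= 0.
\end{aligned}$$

□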


Corollary 20.9. It follows trivially from the conditions we placed on $\psi_1^E$ and $\psi_2^E$ that

$$\det M^E(x) = 1.$$
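The constancy of the determinant is easy to probe numerically. The following is a minimal sketch under our own assumptions: units with $\hbar^2/2m = 1$, so the eigenvalue equation reads $\psi'' = (V - E)\psi$, a classical fourth-order Runge-Kutta integrator, and a hypothetical sample potential $V(x) = 2\cos(2\pi x)$. None of these choices come from the lecture.

```python
import math

def fundamental_matrix(E, V, x_end, steps=2000):
    """Integrate psi'' = (V(x) - E) * psi from 0 to x_end with RK4.

    Returns [[psi1, psi2], [psi1', psi2']] at x_end, where
    psi1(0) = 1, psi1'(0) = 0 and psi2(0) = 0, psi2'(0) = 1.
    """
    h = x_end / steps

    def rhs(x, y):
        psi, dpsi = y
        return (dpsi, (V(x) - E) * psi)

    def integrate(y):
        x = 0.0
        for _ in range(steps):
            k1 = rhs(x, y)
            k2 = rhs(x + h / 2, (y[0] + h / 2 * k1[0], y[1] + h / 2 * k1[1]))
            k3 = rhs(x + h / 2, (y[0] + h / 2 * k2[0], y[1] + h / 2 * k2[1]))
            k4 = rhs(x + h, (y[0] + h * k3[0], y[1] + h * k3[1]))
            y = (
                y[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
                y[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
            )
            x += h
        return y

    psi1, dpsi1 = integrate((1.0, 0.0))
    psi2, dpsi2 = integrate((0.0, 1.0))
    return [[psi1, psi2], [dpsi1, dpsi2]]

def determinant(M):
    """Determinant of a 2x2 matrix given as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def sample_potential(x):
    """Hypothetical periodic potential with period 1."""
    return 2.0 * math.cos(2.0 * math.pi * x)

if __name__ == "__main__":
    for x_end in (0.3, 0.7, 1.0):
        M = fundamental_matrix(3.0, sample_potential, x_end)
        print(x_end, determinant(M))
```

Up to integration error the printed determinants should all equal 1, independently of $x$ and of the chosen $E$.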


20.3 Translation Operator 


Definition. The translation operator is the linear operator such that its action on an
element in its domain is given by

$$(T\psi)(x) := \psi_a(x) := \psi(x + a),$$

for some $a \in \mathbb{R}$.


Proposition 20.10. Let $T$ be the translation operator with $a$ being the periodicity of our
system. Then $T$ commutes with the Hamiltonian,

$$[T, H] = 0.$$

Proof. Consider the action on an element $\psi$ in the domain of both operators,

$$\begin{aligned}
[T, H]\psi(x) &= T(H\psi)(x) - H(T\psi)(x) \\
&= \Big(-\frac{\hbar^2}{2m}\, T\psi'' + T(V\psi)\Big)(x) - H(T\psi)(x) \\
&= H(T\psi)(x) - H(T\psi)(x) \\
&= 0,
\end{aligned}$$

where we have used the fact that translation by a constant doesn't affect the result of
differentiating a function, the periodicity $V(x + a) = V(x)$, and the fact that $T$ is linear. □


Lemma 20.11. Let $\psi^E$ be an $H$-eigenvector of our system. Then $\psi_a^E := T\psi^E$ is also an
$H$-eigenvector.

Proof. From the commutation result,

$$\begin{aligned}
H(T\psi^E)(x) &= T(H\psi^E)(x) \\
H\psi_a^E(x) &= T(E\psi^E)(x) \\
&= E(T\psi^E)(x) \\
&= E\psi_a^E(x).
\end{aligned}$$



So we have that the translated solution is also a solution. We now use the fact that
$\{\psi_1^E, \psi_2^E\}$ is a fundamental set of solutions to expand the translated fundamental solution,

$$\psi_j^E(x + a) = \sum_{i=1}^{2} a^i{}_j\, \psi_i^E(x),$$

for $j = 1, 2$.

Now consider the case $x = 0$; then we have

$$\psi_j^E(a) = a^1{}_j\, \psi_1^E(0) + a^2{}_j\, \psi_2^E(0) = a^1{}_j,$$

and similarly

$$(\psi_j^E)'(a) = a^2{}_j,$$

from which it follows that

$$a^i{}_j = (M^E(a))^i{}_j.$$

So we have that a general solution is of the form

$$\psi^E(x + a) = \sum_{j=1}^{2}\sum_{i=1}^{2} A^j\, (M^E(a))^i{}_j\, \psi_i^E(x),$$

for $A^1, A^2 \in \mathbb{C}$.


Remark 20.12. As we showed previously the translation operator and the Hamiltonian commute,
and so they share common eigenvectors. We shall label these eigenvectors as follows

$$H\psi^{E,\lambda} = E\,\psi^{E,\lambda}, \qquad T\psi^{E,\lambda} = \lambda\,\psi^{E,\lambda}.$$

We can reformulate the second eigenvalue equation as follows: using $T\psi^{E,\lambda}(x) = \psi^{E,\lambda}(x + a)$ we have

$$\psi^{E,\lambda}(x + a) = \lambda\,\psi^{E,\lambda}(x)$$

$$\sum_{i,j=1}^{2} A_\lambda^j\, (M^E(a))^i{}_j\, \psi_i^E(x) = \lambda \sum_{i=1}^{2} A_\lambda^i\, \psi_i^E(x),$$

where we have introduced a $\lambda$ label (not an index!) on $A$ to indicate that it corresponds
to a specific $\lambda$. Now using the linear independence of the fundamental solutions, we can
equate coefficients; then noting that the LHS is just matrix multiplication we see that we
can write $A_\lambda$ as a column matrix with two entries

$$A_\lambda = \begin{pmatrix} A_\lambda^1 \\ A_\lambda^2 \end{pmatrix},$$

which is an eigenvector of the fundamental matrix at $a$ with eigenvalue $\lambda$.




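If we let $\lambda_1$ and $\lambda_2$ be the two eigenvalues of $M^E(a)$, then there is some basis such that

$$M^E(a) = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix},$$

from which we have$^{50}$

$$\det M^E(a) = \lambda_1 \cdot \lambda_2, \qquad \operatorname{Tr} M^E(a) = \lambda_1 + \lambda_2.$$

Then using the result $\det M^E(a) = 1$, we have the condition

$$\lambda_1 \cdot \lambda_2 = 1,$$

and we just want some way to find the trace to work out what the second condition is.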
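To see why the trace is the only missing piece of information, it may help to write out the characteristic polynomial of the $2 \times 2$ matrix $M^E(a)$ explicitly. The shorthand $\tau := \operatorname{Tr} M^E(a)$ is introduced here for convenience and is not notation from the lecture.

```latex
\begin{align*}
0 &= \det\big(M^E(a) - \lambda\,\mathbb{1}\big)
   = \lambda^2 - \big(\operatorname{Tr} M^E(a)\big)\lambda + \det M^E(a)
   = \lambda^2 - \tau\lambda + 1, \\
\lambda_{1,2} &= \frac{\tau}{2} \pm \sqrt{\frac{\tau^2}{4} - 1},
   \qquad \lambda_1\lambda_2 = 1, \qquad \lambda_1 + \lambda_2 = \tau .
\end{align*}
```

In particular, for real $\tau$ both eigenvalues lie on the unit circle exactly when $|\tau| \leq 2$, and otherwise they are real with $|\lambda_1| \neq 1$.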


20.4 Application to Our Quantum Problem 


Theorem 20.13 (Floquet$^{51}$). Let $V(x)$ be a complex, piecewise continuous, periodic function
with minimum period $\pi$. Then the ODE

$$y'' + V(x)y = 0$$

has two linearly independent solutions, which can be written as

$$f_1(x) = e^{i\theta x/\pi}\, p_1(x), \qquad f_2(x) = e^{-i\theta x/\pi}\, p_2(x),$$

for $\theta \in [-\pi, \pi)$ and where $p_1(x)$ and $p_2(x)$ are periodic with the same period as $V(x)$.


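Remark 20.14. Floquet's theorem also tells us that the eigenvalues are simply

$$\lambda_1 = e^{i\theta} \quad\text{and}\quad \lambda_2 = e^{-i\theta}.$$

We have, then, that

$$\psi_1^E(a) + (\psi_2^E)'(a) = \operatorname{Tr} M^E(a) = e^{i\theta} + e^{-i\theta} = 2\cos\theta.$$

We now define the function

$$\gamma_V(E) := \frac{1}{2}\big(\psi_1^E(a) + (\psi_2^E)'(a)\big) = \cos\theta,$$

where the subscript indicates that it's for the type of potentials we're considering.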


50 Note these results are basis independent for any transformation given by an endomorphism.
51 Based on a combination of the one given in the lectures and the one given on Wolfram.




20.5 Energy Bands 
Recall that $\psi_1^E$ and $(\psi_2^E)'$ are entire functions of $E$, and so their sum is also an entire function.
From this it follows that the restriction of $\gamma_V$ to the reals,

$$\gamma_V|_{\mathbb{R}} : \mathbb{R} \to \mathbb{C},$$

is at least smooth. Thus we know that

(i) If $(E_0, \theta_0)$ solves the equation $\gamma_V(E_0) = \cos\theta_0$, then any $E$ in a sufficiently small
neighbourhood of $E_0$ solves the equation $\gamma_V(E) = \cos\theta$ for some $\theta$.

(ii) Equivalently, if $E_1$ does not solve $\gamma_V(E_1) = \cos(\theta_1)$ for any $\theta_1 \in [-\pi, \pi)$, then no $E$ in
a sufficiently small neighbourhood will.


From these conditions we can draw the remarkable conclusion: For any periodic, piecewise
continuous and bounded potential, the energy spectrum is a countable union of open
intervals, known as energy bands.
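The band structure can be seen in a small numerical experiment. The sketch below is our own illustration, not the calculation from the lecture: it assumes units with $\hbar^2/2m = 1$, period $a = 1$, a hypothetical potential $V(x) = 2\cos(2\pi x)$, and a fourth-order Runge-Kutta integrator for the fundamental solutions. An energy is treated as allowed when $|\gamma_V(E)| \leq 1$, i.e. when $\gamma_V(E) = \cos\theta$ has a real solution $\theta$.

```python
import math

def gamma_V(E, V, a=1.0, steps=2000):
    """Half the trace of the fundamental matrix at x = a for psi'' = (V - E) psi."""
    h = a / steps

    def rhs(x, y):
        psi, dpsi = y
        return (dpsi, (V(x) - E) * psi)

    def integrate(y):
        # Classical RK4 from x = 0 to x = a.
        x = 0.0
        for _ in range(steps):
            k1 = rhs(x, y)
            k2 = rhs(x + h / 2, (y[0] + h / 2 * k1[0], y[1] + h / 2 * k1[1]))
            k3 = rhs(x + h / 2, (y[0] + h / 2 * k2[0], y[1] + h / 2 * k2[1]))
            k4 = rhs(x + h, (y[0] + h * k3[0], y[1] + h * k3[1]))
            y = (
                y[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
                y[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
            )
            x += h
        return y

    psi1_a, _ = integrate((1.0, 0.0))    # psi1(0) = 1, psi1'(0) = 0
    _, dpsi2_a = integrate((0.0, 1.0))   # psi2(0) = 0, psi2'(0) = 1
    return 0.5 * (psi1_a + dpsi2_a)

def cosine_potential(x):
    """Hypothetical periodic potential with period 1."""
    return 2.0 * math.cos(2.0 * math.pi * x)

def is_allowed(E, V):
    """True if gamma_V(E) = cos(theta) has a real solution theta."""
    return abs(gamma_V(E, V)) <= 1.0

if __name__ == "__main__":
    for k in range(-2, 25):
        E = 0.5 * k
        label = "band" if is_allowed(E, cosine_potential) else "gap"
        print(f"E = {E:5.1f}: gamma = {gamma_V(E, cosine_potential):+.4f} ({label})")
```

For the free case $V = 0$ the function reduces to $\gamma_V(E) = \cos(\sqrt{E}\,a)$, which is a convenient check of the integrator. With the cosine potential a gap opens near $E \approx \pi^2$.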


Remark 20.15. Note the fact that we only have continuous parts to our spectrum is consistent
with our rigged Hilbert space ideas; the functions $\psi^{E,\lambda}$ contain a phase factor and
then a periodic function, and so are not square integrable, but they are bounded and so are
elements of $S^*(\mathbb{R}^d) \setminus L^2(\mathbb{R}^d)$.


20.6 Quantitative Calculation (Outline) 


To obtain the precise intervals (bands) one needs to find (perhaps numerically) the fundamental
solutions. One obtains, for instance, for a potential of the form




21 Relativistic Quantum Mechanics 


As we shall see the transition into relativistic quantum mechanics is highly non-trivial; in 
the sense that we don’t simply add a new term onto our expressions that accounts for the 
relativistic effects. This lecture is meant as a very brief overview/introduction to quantum 


field theory, and so does not claim to be self contained in any sense. The main idea we 
want to highlight is how the ideas change once we start accounting for relativistic effects, 


and what the repercussions of those changes are. 


21.1 Heuristic Derivation of the Schrödinger Equation


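Schrödinger recognised that he could obtain the Schrödinger equation from the classical
energy-momentum relation

$$E = \frac{p_a p^a}{2m} + V$$

using the substitutions

$$E \rightsquigarrow i\hbar\,\partial_t, \qquad p_a \rightsquigarrow -i\hbar\,\partial_a,$$

giving

$$i\hbar\,\partial_t\psi = -\frac{\hbar^2}{2m}\,\partial_a\partial^a\psi + V\psi,$$

for $\psi: \mathbb{R} \times \mathbb{R}^3 \to \mathbb{C}$.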
If we want to get the probabilistic interpretation that quantum mechanics is built on, 
we need to introduce an object that 


(i) Is non-negative definite, 
(ii) Integrates to unity, 
(iii) This integral doesn’t change in time. 


As we have been using all along, the object needed is

$$\rho := |\psi|^2.$$

We might ask ourselves 'How does one come up with the idea of such an object?', the
answer to which comes from the following.

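Firstly it's clear that $\rho \geq 0$, by definition. We can also always
arrange for the integral over all space to be unity by normalisation. Now consider the
Schrödinger equation and its complex conjugate

$$i\hbar\,\partial_t\psi = -\frac{\hbar^2}{2m}\,\partial_a\partial^a\psi + V\psi$$

$$-i\hbar\,\partial_t\overline{\psi} = -\frac{\hbar^2}{2m}\,\partial_a\partial^a\overline{\psi} + V\overline{\psi}.$$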




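If we multiply the former from the right by $\overline{\psi}$ and the latter from the left by $\psi$, then subtract
the two results we arrive, after rearranging a bit, at

$$\begin{aligned}
i\hbar\big[(\partial_t\psi)\overline{\psi} + \psi(\partial_t\overline{\psi})\big] &= -\frac{\hbar^2}{2m}\big[(\partial_a\partial^a\psi)\overline{\psi} - \psi(\partial_a\partial^a\overline{\psi})\big] \\
i\hbar\,\partial_t(\psi\overline{\psi}) &= -\frac{\hbar^2}{2m}\,\partial_a\big[(\partial^a\psi)\overline{\psi} - \psi(\partial^a\overline{\psi})\big].
\end{aligned}$$

Then defining

$$\rho := \overline{\psi}\psi, \qquad j^a := -\frac{i\hbar}{2m}\big[(\partial^a\psi)\overline{\psi} - \psi(\partial^a\overline{\psi})\big],$$

we arrive at the continuity equation

$$\partial_t\rho + \partial_a j^a = 0,$$

from which it follows that

$$\frac{d}{dt}\int_{\mathbb{R}^3} d^3x\, \rho(x) = -\int_{\mathbb{R}^3} d^3x\, (\partial_a j^a)(x) = 0,$$

where we have used the fact that we are integrating over all of $\mathbb{R}^3$ (which has no boundary/surface)
with the fact that $\partial_a j^a$ is a purely surface term.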
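As a quick illustration of the density and current, consider a free plane wave. This is our own example (a plane wave is of course not normalisable, so it only illustrates the local continuity equation, not the normalisation), with the current taken as $j^a = -\frac{i\hbar}{2m}\big[(\partial^a\psi)\overline{\psi} - \psi(\partial^a\overline{\psi})\big]$ and $V = 0$ assumed.

```latex
\begin{align*}
\psi(t, x) &= e^{i(k_a x^a - \omega t)}, \qquad \omega = \frac{\hbar\, k_a k^a}{2m}, \\
\rho &= \overline{\psi}\psi = 1, \\
j^a &= -\frac{i\hbar}{2m}\Big[(\partial^a\psi)\overline{\psi} - \psi(\partial^a\overline{\psi})\Big]
     = -\frac{i\hbar}{2m}\big[i k^a - (-i k^a)\big] = \frac{\hbar k^a}{m}, \\
\partial_t\rho + \partial_a j^a &= 0 + 0 = 0 .
\end{align*}
```

The current is the density times the classical velocity $\hbar k/m$, as one would hope.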


21.2 Heuristic Derivation of the Relativistic Schrödinger Equation (Klein-Gordon Equation)


A note is made on the structural difference between non-relativistic spacetime and the
spacetime of general relativity. For a much deeper discussion of this the reader is directed
to Dr. Schuller's International Winter School on Gravity and Light.$^{52}$

Schrödinger, quite courageously, tried to apply the method of the previous section to
the relativistic energy-momentum relation ($c = 1$ here),

$$E^2 = p^2 + m^2,$$

which gives

$$-\hbar^2\,\partial_t^2\phi = -\hbar^2\,\partial_a\partial^a\phi + m^2\phi,$$

which, after rearranging (and setting $\hbar = 1$), gives the so-called Klein-Gordon equation$^{53}$

$$(\Box + m^2)\phi = 0,$$

where we have introduced the d'Alembert operator

$$\Box := \partial_t^2 - \partial_a\partial^a.$$


52 Available via YouTube.
53 It is named such as Oskar Klein and Walter Gordon also arrived at this result after Schrödinger, and
proceeded to try and interpret it as the description of relativistic electrons, which we shall see shortly is
not the case.




The question we now have to ask is 'can we still obtain some probability interpretation
using $\phi$?' The answer is no, as we shall now show.

In analogy with the non-relativistic case, in order to ensure the integral is constant
in time, we wish to find a $J^\mu$ such that

$$\partial_\mu J^\mu = \partial_0 J^0 + \partial_a J^a = 0.$$

Again similarly to before, by considering the Klein-Gordon equation and its complex conjugate
we arrive at

$$J^\mu = (\partial^\mu\phi)\overline{\phi} - \phi(\partial^\mu\overline{\phi}),$$

and Gauss' theorem tells us that the only candidate for the probability density $\rho$ is

$$\rho := J^0 := (\partial^0\phi)\overline{\phi} - \phi(\partial^0\overline{\phi}).$$

This all looks fine, in fact it looks exactly like the non-relativistic case. However there is one
subtle, yet highly important, difference. In the Schrödinger equation we had only first order
time derivatives, whereas the Klein-Gordon equation is second order in time derivatives.
This means that for the latter we can prescribe as initial conditions not only $\phi$ but also its
time derivative. We can therefore arrange for $\rho$ to be negative, which violates the
interpretation of $\rho$ as a probability density. This is
a problem that cannot be removed at this level, the reason for which we shall soon see.
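A concrete illustration of the sign problem, as our own example: take the two plane-wave solutions of the Klein-Gordon equation with opposite frequency. The sketch assumes $\hbar = c = 1$ and the $(-,+,+,+)$ signature, so that $\partial^0 = -\partial_t$, and it uses $\rho = (\partial^0\phi)\overline{\phi} - \phi(\partial^0\overline{\phi})$ without the overall constant factor that is usually included to make the density real.

```latex
\begin{align*}
\phi_\pm(t, x) &= e^{i(p_a x^a \mp E t)}, \qquad E := \sqrt{p^2 + m^2} > 0, \\
(\Box + m^2)\phi_\pm &= \big(-E^2 + p^2 + m^2\big)\phi_\pm = 0, \\
\partial^0\phi_\pm &= -\partial_t\phi_\pm = \pm iE\,\phi_\pm, \qquad
\partial^0\overline{\phi_\pm} = \mp iE\,\overline{\phi_\pm}, \\
\rho_\pm &= (\partial^0\phi_\pm)\overline{\phi_\pm} - \phi_\pm(\partial^0\overline{\phi_\pm}) = \pm 2iE .
\end{align*}
```

Both $\phi_+$ and $\phi_-$ solve the equation, yet their densities differ by an overall sign, so no choice of constant prefactor makes $\rho$ non-negative on all solutions.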


Remark 21.1. Historically, Schrödinger actually arrived at the relativistic equation first (as
he knew this was ultimately where he wanted to go), however when running into the problem
highlighted above he decided instead to consider the non-relativistic case and ended up with
the Schrödinger equation.


Remark 21.2. Note some people often refer to the time and spatial derivatives being on an
equal footing in the Klein-Gordon equation. This is a highly misleading choice of words.
They are not on an equal footing, as the temporal derivatives come with a positive sign,
whereas the spatial ones come with a negative sign. This is not just some little difference
to be brushed over. Indeed, this minus sign stems from Maxwell's equations, and
without it the Klein-Gordon equation would be physically useless; the gist being that if
time and space were on an equal footing then we would not be able to predict the future,
which is the main driving force of physics. What people should say is that they are on a
similar footing.


21.3 Dirac Equation 


So, as we have seen, it is this second derivative that causes us the problems, so the natural
question is 'How do we avoid it?' The immediate answer is to try considering

$$E = \sqrt{p^2 + m^2},$$

and then using the substitutions as above. However this is no better as now the RHS is
the square root of a differential operator, and so in the expansion about $m^2$ we will end up
with a theory with infinite spatial derivative order. Not good at all!
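To make the 'infinite derivative order' explicit, one can expand the square root for small $p^2/m^2$ and then substitute. The following is a formal sketch only (the series in $p^2/m^2$ converges only for $|p| < m$), with $c = 1$ and the substitution $p_a \rightsquigarrow -i\hbar\,\partial_a$ as in the non-relativistic case.

```latex
\begin{align*}
E = \sqrt{p^2 + m^2} &= m\sqrt{1 + \frac{p^2}{m^2}}
   = m + \frac{p^2}{2m} - \frac{p^4}{8m^3} + \frac{p^6}{16m^5} - \dots, \\
i\hbar\,\partial_t\psi &= \Big(m - \frac{\hbar^2}{2m}\,\partial_a\partial^a
   - \frac{\hbar^4}{8m^3}\,(\partial_a\partial^a)^2
   - \frac{\hbar^6}{16m^5}\,(\partial_a\partial^a)^3 - \dots\Big)\psi .
\end{align*}
```

The first two terms are the rest energy and the familiar non-relativistic kinetic term; every further term brings two more spatial derivatives.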




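Dirac then asked the question 'What if I use a different substitution prescription such
that the whole of the RHS becomes a first order derivative?' Following this thought, after
several calculations, he arrived at the so-called Dirac equation

$$(i\gamma^\mu\partial_\mu - m\,\mathbb{1}_4)\Psi = 0,$$

where the $\gamma^\mu$ are $4 \times 4$ matrices satisfying the anticommutation relation

$$\{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}\,\mathbb{1}_4$$

(with this signature the sign is the one for which the Dirac operator squares to the Klein-Gordon operator),
known as the Dirac algebra, where $\eta^{\mu\nu}$ is the Minkowski metric given by$^{54}$

$$\eta^{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1).$$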
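An explicit set of matrices satisfying such a relation can be checked by brute force. The sketch below is our own illustration: it builds the gamma matrices in the standard Dirac representation and verifies $\{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}\mathbb{1}_4$ for $\eta = \mathrm{diag}(-1, 1, 1, 1)$. The overall sign of the relation depends on the signature convention; the one used here is an assumption, chosen so that $(\gamma^0)^2 = +\mathbb{1}_4$. All helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def block(top_left, top_right, bottom_left, bottom_right):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    top = [top_left[i] + top_right[i] for i in range(2)]
    bottom = [bottom_left[i] + bottom_right[i] for i in range(2)]
    return top + bottom

def negate(m):
    return [[-entry for entry in row] for row in m]

IDENTITY_2 = [[1, 0], [0, 1]]
ZERO_2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Dirac representation: gamma^0 = diag(1, 1, -1, -1),
# gamma^i = [[0, sigma_i], [-sigma_i, 0]].
GAMMA = [block(IDENTITY_2, ZERO_2, ZERO_2, negate(IDENTITY_2))]
GAMMA += [block(ZERO_2, s, negate(s), ZERO_2) for s in PAULI]

ETA = [-1, 1, 1, 1]  # diagonal of the Minkowski metric, signature (-,+,+,+)

def anticommutator(mu, nu):
    ab = matmul(GAMMA[mu], GAMMA[nu])
    ba = matmul(GAMMA[nu], GAMMA[mu])
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def satisfies_dirac_algebra():
    """Check {gamma^mu, gamma^nu} = -2 eta^{mu nu} * identity entry by entry."""
    for mu in range(4):
        for nu in range(4):
            diagonal = -2 * ETA[mu] if mu == nu else 0
            ac = anticommutator(mu, nu)
            for i in range(4):
                for j in range(4):
                    target = diagonal if i == j else 0
                    if abs(ac[i][j] - target) > 1e-12:
                        return False
    return True

if __name__ == "__main__":
    print(satisfies_dirac_algebra())
```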


The vector $\Psi$ here is a 4 component object known as a spinor, where (after some work) we
can think of two of the components being a particle and the other two being the associated
antiparticle.

So the Dirac equation introduces antimatter into the mix, however it turns out that it
still doesn't fix the probability problem raised by the Klein-Gordon equation!


21.4 The Deep Root of the Problem


It turns out the problem stems from perhaps the most well-known equation in the world,
$E = mc^2$, which tells us that we need not conserve particle number. For example the
interaction between two particles could result in any number of resulting particles (provided
the energy is sufficiently large).

If this information is contained within $E = mc^2$, it is also contained in the relativistic
energy-momentum equation $E^2 = p^2c^2 + m^2c^4$, and so we were unjustified to look for an
equation in the relativistic context that describes one (or any fixed number) of particles.
In other words, our theory needs to account for this non-conservation of particle number,
including having no particles (i.e. the vacuum). Mathematically the idea is to construct a
direct sum of Hilbert spaces, each of which describes a different number of particles. This
is known as the Fock space,

$$\mathcal{F} := \mathbb{C} \oplus \mathcal{H} \oplus (\mathcal{H} \otimes \mathcal{H}) \oplus (\mathcal{H} \otimes \mathcal{H} \otimes \mathcal{H}) \oplus \dots$$


54 Using the $(-,+,+,+)$ signature.




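We can then construct the inner product on $\mathcal{F}$ from the inner products on the Hilbert
spaces, all of which are obtained from the inner product on $\mathcal{H}$. That is, if $\psi, \varphi \in \mathcal{F}$ are given by

$$\psi = a_0 \oplus a_1\psi_1 \oplus \sum_{ij} a^{ij}(\psi_{2i} \otimes \psi_{2j}) \oplus \dots$$

$$\varphi = b_0 \oplus b_1\varphi_1 \oplus \sum_{ij} b^{ij}(\varphi_{2i} \otimes \varphi_{2j}) \oplus \dots,$$

then the inner product is given by

$$\langle\psi|\varphi\rangle_{\mathcal{F}} = \overline{a_0}b_0 + \overline{a_1}b_1\langle\psi_1|\varphi_1\rangle_{\mathcal{H}} + \sum_{ijk\ell} \overline{a^{ij}}\,b^{k\ell}\,\langle\psi_{2i} \otimes \psi_{2j}|\varphi_{2k} \otimes \varphi_{2\ell}\rangle_{\mathcal{H}\otimes\mathcal{H}} + \dots$$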
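
Remark 21.3. For the Klein-Gordon equation it is actually the symmetrised tensor products
we need, giving the symmetric Fock space

$$\odot\mathcal{F} := \mathbb{C} \oplus \mathcal{H} \oplus (\mathcal{H} \odot \mathcal{H}) \oplus (\mathcal{H} \odot \mathcal{H} \odot \mathcal{H}) \oplus \dots$$

and it describes bosonic systems.
Similarly for the Dirac equation we require the anti-symmetrised tensor product, giving

$$\wedge\mathcal{F} := \mathbb{C} \oplus \mathcal{H} \oplus (\mathcal{H} \wedge \mathcal{H}) \oplus (\mathcal{H} \wedge \mathcal{H} \wedge \mathcal{H}) \oplus \dots$$

which describes fermionic systems.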


We claim that once you lift the Klein-Gordon and Dirac equations onto their respective
Fock spaces, the problem of negative probability density vanishes.


21.5 Feynman Diagrams 


Richard Feynman developed a pictorial representation of the highly complicated interaction 
of particles. The basic idea is to take a perturbation expansion of the interaction and 
represent each order by a set of diagrams. The order of the diagram is associated with the 
number of so-called vertices present. 

For example the first few terms for the process of electron-electron scattering would be
drawn as:

[Figure: Feynman diagrams for electron-electron scattering, from zeroth to fourth order.]

The first diagram (which has no vertices) is the zeroth order diagram, the second one
(with two vertices) is the second order diagram and the last two (which both have 4 vertices)
are the fourth order diagrams. The leftmost and rightmost arrows (i.e. the ones that
have a non-vertexed end) are known as external lines, and the other ones are known as
internal lines. Particles represented by internal lines are often referred to as virtual particles.


Remark 21.4. One should be careful when it comes to drawing the arrows on the internal 
lines, however, as (unless the rest of the diagram indicates otherwise) we could have a 
particle or an antiparticle (whose arrow points the opposite way). It is for this reason that 
the so-called loop in the third diagram does not have arrows. On the final diagram we do 
draw the arrows, as conservation of electric charge forces us to ensure our virtual particles 
are electrons (not positrons, the anti-electron). 

Note also on loop internal lines we have simply written $e$ and not $e^-$ or $e^+$ (the positron);
this further indicates that we do not know which is which, only that one must be an electron
and the other a positron.


Mathematically Feynman diagrams correspond to integral equations, and there is a
set of rules (cleverly named Feynman rules) which tell you how to convert the diagrams
into these integrals. They are an indispensable tool when it comes to studying relativistic
particle physics, as not only are they quicker to draw than writing out integrals, they have
some incredibly elegant properties (such as so-called crossing symmetry) which make the
calculations significantly easier. However, we shall not go into any more detail here; the
unfamiliar reader is directed to the massive resource of information in textbooks/on the
internet.




FURTHER READINGS 


Mathematical quantum mechanics 


• Ballentine, Quantum Mechanics: A Modern Development (Second edition), World Scientific 2014

• Faddeev, Yakubovskii, Lectures on Quantum Mechanics for Mathematics Students, American Mathematical Society 2009

• Folland, Quantum Field Theory: A Tourist Guide for Mathematicians, American Mathematical Society 2008

• Gieres, Mathematical surprises and Dirac's formalism in quantum mechanics, https://arxiv.org/abs/quant-ph/9907069

• Hall, Quantum Theory for Mathematicians, Springer 2013

• Mackey, Mathematical Foundations of Quantum Mechanics, Dover Publications 2004

• Moretti, Spectral Theory and Quantum Mechanics: With an Introduction to the Algebraic Formulation, Springer 2013

• Parthasarathy, Mathematical Foundations of Quantum Mechanics, Hindustan Book Agency 2005

• Strocchi, An Introduction to the Mathematical Structure of Quantum Mechanics: A Short Course for Mathematicians, World Scientific 2008

• Takhtajan, Quantum Mechanics for Mathematicians, American Mathematical Society 2008


Standard quantum mechanics textbooks 


• Griffiths, Introduction to Quantum Mechanics


Linear Algebra 

• Friedberg, Insel, Spence, Linear Algebra (4th Edition), Pearson 2002

• Jänich, Linear Algebra, Springer 1994

• Lang, Linear Algebra (Third edition), Springer 1987

• Shakarchi, Solutions Manual for Lang's Linear Algebra, Springer 1996


Topology 

• Adamson, A General Topology Workbook, Birkhäuser 1995

• Kalajdzievski, An Illustrated Introduction to Topology and Homotopy, CRC Press 2015

• Munkres, Topology (Second edition), Pearson 2014




Functional analysis 


• Aliprantis, Burkinshaw, Principles of Real Analysis (Third Edition), Academic Press 1998

• Aliprantis, Burkinshaw, Problems in Real Analysis: A Workbook with Solutions, Academic Press 1998

• Day, Normed Linear Spaces, Springer 1973

• Halmos, A Hilbert Space Problem Book, Springer 1982

• Hunter, Nachtergaele, Applied Analysis, World Scientific 2001

• Kadison, Ringrose, Fundamentals of the Theory of Operator Algebras. Volumes I-II, American Mathematical Society 1997

• Leoni, A First Course in Sobolev Spaces, American Mathematical Society 2009

• Rynne, Youngson, Linear Functional Analysis (Second Edition), Springer 2008

• Serov, Fourier Series, Fourier Transform and Their Applications to Mathematical Physics, Springer 2017


Measure theory and Integration 

• Bartle, A Modern Theory of Integration, American Mathematical Society 2001

• Bartle, Solutions Manual to A Modern Theory of Integration, American Mathematical Society 2001

• Halmos, Measure Theory, Springer 1982

• Nelson, A User-friendly Introduction to Lebesgue Measure and Integration, American Mathematical Society 2015

• Rana, An Introduction to Measure and Integration, American Mathematical Society 2002


